Assay for the detection of factors that modulate the expression of INGAP

ABSTRACT

A reporter construct contains mammalian INGAP 5′-regulatory region or a fragment thereof, a minimal promoter element from mammalian INGAP or a heterologous promoter, and a reporter gene. The reporter construct can be used to screen for agents which alone or in combination up-regulate or down-regulate reporter gene expression. Alternatively, the reporter construct can be used to screen for agents that bind to the hamster INGAP 5′-regulatory region or a fragment thereof.

This application incorporates by reference co-pending provisional application Ser. No. 60/388,315 filed Jun. 14, 2002, Ser. No. 60/361,073 filed Mar. 1, 2002, and Ser. No. 60/346,898 filed Jan. 11, 2002.

FIELD OF THE INVENTION

The invention relates to the field of assays for the detection of factors that modulate gene expression. Specifically, the invention relates to reporter constructs and methods for identifying agents that modulate the expression of the INGAP gene.

BACKGROUND OF THE INVENTION

Islet neogenesis gene associated protein (INGAP protein) has been identified as a pancreatic acinar cell protein that can induce islet cell neogenesis from progenitor cells resident in the pancreas in a manner that recapitulates islet development during normal embryogenesis. INGAP is unique in its ability to stimulate growth and differentiation of islets of Langerhans from precursor cells associated with pancreas. These islets evolve a mature insulin secretory profile capable of responding to perturbations in blood glucose in a physiologic manner. This potential anti-diabetic therapeutic has been shown to demonstrate homology across several species and to exert a biological response.

Pancreatic islet cell mass is lost in type 1 diabetes mellitus, a disease in which a progressive autoimmune reaction results in the selective destruction of insulin-producing β-cells. In type 2 diabetes mellitus, so-called adult-onset disease, but also increasingly a condition in young overweight people, the β-cell mass may be reduced by as much as 60% of normal. The number of functioning β-cells in the pancreas is of critical significance for the development, course, and outcome of diabetes. In type I diabetes, there is a reduction of β-cell mass to less than 2% of normal. Even in the face of severe insulin resistance as occurs in type II diabetes, the development of diabetes only occurs if there is inadequate compensatory increase in β-cell mass. Thus, the development of either of the major forms of diabetes can be regarded as a failure of adaptive β-cell growth and a subsequent deficiency in insulin secretion. Stimulating the growth of islets and β-cells from precursor cells, known as islet neogenesis, is an attractive approach to the amelioration of diabetes. There is a need in the art for methods to identify agents that can modulate the expression of INGAP, whether in animals or in cultured cells.

BRIEF SUMMARY OF THE INVENTION

It is an object of the invention to provide a reporter construct containing the 5′-regulatory region from mammalian INGAP gene.

It is another object of the invention to provide methods for identifying agents which modulate INGAP expression.

It is another object of the invention to provide a nucleic acid or fragment of INGAP 5′-regulatory region.

It is another object of the invention to provide methods for increasing INGAP expression.

It is another object of the invention to provide a kit for modulating INGAP expression.

These and other objects of the invention are provided by one or more of the embodiments described below.

In one aspect of the invention a reporter construct is provided. The reporter construct comprises a regulatory region nucleotide sequence and a nucleotide sequence encoding a detectable product. In one aspect of the invention, the reporter construct is provided in a vector. The regulatory region nucleotide sequence is linked to the nucleotide sequence encoding a detectable product. The regulatory region nucleotide sequence may comprise one or more fragments of 5′ regulatory region of the INGAP genomic sequence, SEQ ID NO: 23, or it may comprise the entire length of the 5′ regulatory region. In one embodiment of the reporter construct, a promoter element is interposed between the regulatory region nucleotide sequence and the nucleotide sequence encoding a detectable product. The promoter element may be selected from the promoter elements present in the INGAP regulatory sequence. Alternatively, the promoter element present in the vector comprising the reporter construct may be used. The detectable product encoded by the said nucleotide sequence encoding a detectable product could be either a nucleic acid or a protein. The detectable product need not be the INGAP gene nucleic acid or protein.

In another embodiment of the invention, a method for identifying agents that modulate INGAP expression is provided. The method comprises contacting a cell with a test agent, wherein the cell comprises a reporter construct of the present invention. Expression of the detectable nucleic acid or protein product in the cell is determined. A test agent is identified as a modulator of INGAP expression if the test agent modulates expression of the detectable product in the cell.

In another embodiment of the invention, an isolated nucleic acid comprising the genomic sequence of the hamster INGAP gene (SEQ ID NO: 2), or a fragment thereof, is provided.

According to another embodiment of the invention, an in vitro method for identifying agents that modulate INGAP expression is provided. The method comprises contacting a test agent with a reporter construct of the present invention in a cell-free system that allows for transcription and translation of a nucleotide sequence. Expression of the detectable product is determined. The test agent is identified as a modulator of INGAP expression if the test agent modulates expression of the detectable product.

According to another embodiment of the invention, an in vitro method for identifying an agent that modulates INGAP expression is provided. The method comprises contacting a test agent with a nucleic acid of the invention. Binding of the test agent to the nucleic acid is determined. The test agent is identified as a modulator of INGAP expression if the test agent binds to the nucleic acid.

According to another embodiment of the invention a method for increasing INGAP expression is provided. An effective amount of a factor that stimulates INGAP expression directly or indirectly, for example cytokines, chemokines, growth factors, or pharmacological agents, is administered to a mammal in need of increased INGAP expression.

According to another embodiment of the invention a kit for modulating INGAP expression is provided. The kit comprises a modulator of INGAP expression and instructions for using the modulator of INGAP expression to modulate INGAP expression.

According to another embodiment of the invention a method for modulating INGAP expression in a mammal to treat a disease state related to reduced islet cell function is provided. The method comprises the step of administering to the mammal an effective amount of a modulator of INGAP expression whereby the level of INGAP expression in the mammal is modified.

All documents cited are, in relevant part, incorporated herein by reference; the citation of any document is not to be construed as an admission that it is prior art with respect to the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the annotation of the hamster INGAP gene structure. The boundaries of introns 1-5 are listed in Table 1.

FIG. 2 shows an overview of the 5′-regulatory region of the hamster INGAP gene (nucleotides 1-3137 of SEQ ID NO: 2) showing many well known and well-characterized transcription factor binding sites. The minimal promoter element contains the regions noted with an underline (CAAT-box, TATA-box, and GC-box).

FIG. 3 shows a schematic of many well known and well-characterized transcription factor-binding sites for nucleotides 1-3123 of the 5′-regulatory region (SEQ ID NO: 1) of the hamster INGAP gene. Table 3 further describes these transcription factor-binding sites.

FIG. 4 shows the predicted transcription start sites within the 5′-regulatory region (SEQ ID NO: 1) of the hamster INGAP gene (SEQ ID NO: 2). The predicted start site is indicated by a boldface nucleotide. The start and end nucleotide numbers are indicated for the promoter sequence. The numbers refer to nucleotide numbers of the hamster INGAP gene (SEQ ID NO: 2).

FIG. 5 shows the adapter primer structure and sequence used in gene walking. Adapter primer 1 (AP1) and adapter primer 2 (AP2) are shown.

FIGS. 6 and 7 show the strategy for reconstructing the hamster INGAP gene. The hamster INGAP gene was reconstructed using the technique of gene walking. Shown are the fragments and the gene specific primers (GSP1 and GSP2) used in PCR amplification for gene walking. Fragments were joined together using unique restriction enzyme sites within each fragment. The nucleotide sequences of the individual primers are listed in Table 2.

FIG. 8 shows the fragments of INGAP 5′-regulatory region, which were cloned into pβGal-basic upstream of a β-galactosidase reporter gene. The labels on the left refer to the nucleotide fragments of SEQ ID NO: 23 which were cloned upstream of pβGal-basic.

FIG. 9A shows reporter activity in human embryonic kidney cells (293T) transfected with a reporter construct that contains various fragments of the 5′-regulatory region (SEQ ID NO: 23) of hamster INGAP DNA cloned upstream of a β-galactosidase reporter gene (pβGal-basic), or with a reporter construct which contains no INGAP DNA. The cells are stimulated with phorbol myristate acetate. Promoter activity is assessed by determining the level of β-galactosidase present in the cell using a β-galactosidase luminescent assay.

FIG. 9B shows reporter activity in human embryonic kidney cells (293T) transfected with a reporter construct that contains nucleotides 2030 to 3137 of the 5′-regulatory region (SEQ ID NO: 23) of hamster INGAP cloned upstream of a β-galactosidase reporter gene, or with a reporter construct which contains no INGAP DNA. The cells are stimulated with leukemia inhibitory factor. Promoter activity is assessed by determining the level of β-galactosidase present in the cell using a β-galactosidase luminescent assay.

FIG. 10 shows the reporter activity in human embryonic kidney cells (293T) transfected with a reporter construct that contains different fragments (see FIG. 8) of the 5′-regulatory region of hamster INGAP cloned upstream of a β-galactosidase reporter gene. The cells are stimulated with phorbol myristate acetate (PMA). Concentrations of PMA used are 6 ng/ml, 17 ng/ml, 50 ng/ml, 100 ng/ml, or 300 ng/ml. Promoter activity is assessed by determining the level of β-galactosidase present in the cell using a β-galactosidase luminescent assay.

FIG. 11 shows reporter activity in human embryonic kidney cells (293T) transfected with a reporter construct that contains different fragments (see FIG. 8) of the 5′-regulatory region of hamster INGAP cloned upstream of a β-galactosidase reporter gene. The cells are stimulated with human leukemia inhibitory factor (hLIF). Concentrations of hLIF used are 1 ng/ml, 10 ng/ml, or 30 ng/ml. Promoter activity is assessed by determining the level of β-galactosidase present in the cell using a β-galactosidase luminescent assay.

FIG. 12 shows RNA analysis for INGAP gene upregulation in rat amphicrine pancreatic cells, AR42J, treated with cytokine IL-6 or untreated. Total RNA is probed by Northern analysis for INGAP gene.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise.

The term “promoter” is used to define the region of a gene at which initiation and rate of transcription are controlled. It contains the site at which RNA polymerase binds and also sites for the binding of regulatory proteins, e.g. transcription factors, repressors, etc. In order to differentiate between the transcription initiation site and other sites that modulate rate of transcription, the promoter region is generally subdivided into “minimal promoter element” and “regulatory region”. The term “minimal promoter element”, sometimes simply referred to as “promoter”, therefore may include a TATA box, GC-rich sequence, and CAAT box, while “regulatory region” is usually a long stretch of nucleotide sequence where transcription factors and other factors bind. Most eukaryotic genes have long regulatory regions where many different transcription factors bind. The expression or the lack of expression of a given gene in a given cell type, tissue, organ, or organism is governed by the interactions that take place on its regulatory region.

The term “transcription factor” is used to describe the proteins that bind short stretches of DNA in the regulatory regions of a gene. Transcription factors may interact with each other as well as RNA polymerase. Thus, transcription factors may bind hormones or second messengers, DNA, RNA, other transcription factors, or other proteins. They may activate or inhibit transcription of a given gene. Transcription factors are also sometimes referred to as “enhancers” or “repressors”. Transcription factor binding sites can be used to identify agents that bind to the 5′-regulatory region of the gene and modulate the gene's expression.

The term “reporter” is used to describe a coding sequence attached to a heterologous promoter or enhancer elements and whose product, either nucleic acid or protein, is easily detected and is quantifiable. Some common reporter genes include β-galactosidase (lacZ), chloramphenicol acetyltransferase (cat), β-glucuronidase (GUS), and green fluorescent protein (GFP).

A “reporter construct” is a piece of nucleic acid that includes a promoter element and a reporter gene housed in a suitable vector plasmid DNA. Regulatory region nucleotide sequences may be cloned 5′ of the promoter element to determine if they contain transcription factor binding sites. The reporter construct-containing vector is introduced into a cell that contains many transcription factors. Activation of the reporter gene by transcription factors may be monitored by detection and quantification of the product of the reporter gene.

The term “agent” is used here to describe essentially any means to modulate INGAP expression. An agent may be a chemical compound, a biological agent, a physical force, a mechanical contraption, or any combination thereof.

INGAP Promoter and Regulatory Region

It is a discovery of the present inventors that the INGAP gene is regulated by a 5′-regulatory region that is susceptible to modulation by many known transcription factors, including PMA and LIF.

It is a further discovery of the present invention that the 5′-regulatory region nucleotide sequence of the INGAP gene may be used in screening assays to identify agents capable of modulating the INGAP gene expression. These modulating agents have potential as therapeutic agents for treating pathological conditions including, but not limited to, diabetes mellitus, both type 1 and type 2, endocrine and non-endocrine hypoplasia, hypertrophy, adenoma, neoplasia, and nesidioblastosis.

Mammalian INGAP, like most genes, has a 5′-regulatory region followed by introns and exons. The sequence of a mammalian (Hamster sp.) INGAP gene is provided as SEQ ID NO: 2. FIG. 1 details the relative location of the 5′-regulatory region, the introns and the exons of the hamster INGAP gene. The boundaries of introns 1-5 and the location of the TATA-box and the poly-A signal are listed in Table 1.

TABLE 1

Description       Position In INGAP Gene (SEQ ID NO: 2)
TATA-Box          3094
INTRON 1          3150-3426
INTRON 2          3508-4442
INTRON 3          4562-4735
INTRON 4          4874-5459
INTRON 5          5587-5843
Poly-A Signal     6098-6103
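As a reading aid for Table 1, the following minimal Python sketch computes the length of each intron from the listed boundaries. It assumes that the positions are 1-based and inclusive, which the specification does not state explicitly; the names `INTRONS` and `intron_lengths` are illustrative only.

```python
# Intron boundaries copied from Table 1 (positions in SEQ ID NO: 2).
# Assumption: coordinates are 1-based and inclusive at both ends.
INTRONS = {
    "INTRON 1": (3150, 3426),
    "INTRON 2": (3508, 4442),
    "INTRON 3": (4562, 4735),
    "INTRON 4": (4874, 5459),
    "INTRON 5": (5587, 5843),
}


def intron_lengths(introns=INTRONS):
    """Return a dict mapping each intron name to its length in nucleotides."""
    return {name: end - start + 1 for name, (start, end) in introns.items()}


if __name__ == "__main__":
    for name, length in intron_lengths().items():
        print(f"{name}: {length} nt")
```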

The nucleotide sequence of the 5′-regulatory region, including the promoter elements of mammalian INGAP, is shown partially in SEQ ID NO: 1, and completely in SEQ ID NO: 2 and 23 (nucleotides 1-3137 of SEQ ID NO: 2). Nucleotides 1-3120 of SEQ ID NO: 1 are identical to nucleotides 1-3120 of SEQ ID NO: 2 and SEQ ID NO: 23. An overview of the 5′-regulatory region is shown in FIG. 2. Representative transcription enhancer/repressor binding sites are shown also in FIG. 2. Predicted transcription enhancer/repressor binding sites for nucleotides 1-3123 of the 5′-regulatory region are shown in FIG. 3. Table 3 at the end of the specification details these transcription factors and their binding sites, and their locations in the regulatory region. Potential transcription factor binding analysis was done using MatInspector professional, which is a bioinformatics software that utilizes a library of matrix descriptions for transcription factor binding sites to locate matches in sequences of unlimited length (Quandt, K., Frech, K., Karas, H., Wingender, E., Werner, T. (1995) Nucleic Acids Res. 23, 4878-4884).

Table 3 lists predicted binding proteins (Further Information) based upon their classification into functionally similar matrix families (Family/matrix). The DNA sequence predicted to bind the protein (Sequence), whether sense or antisense DNA (Str), and location of the sequence in SEQ ID NO: 2 (Position) are listed. Further, the similarity to the consecutive highest conserved nucleotides of a matrix (Core sim.) and similarity to all nucleotides in that matrix (Matrix sim.), along with the optimized value (Opt) defined in a way that a minimum number of matches is found in non-regulatory test sequences, are also listed. Details of the algorithms used in MatInspector professional™ are referenced:

OPT: This matrix similarity is the optimized value defined in a way that a minimum number of matches are found in non-regulatory test sequences (i.e. with this matrix similarity the number of false positive matches is minimized). This matrix similarity is used when the user checks “Optimized” as the matrix similarity threshold for MatInspector professional™.

Family: Each matrix belongs to a so-called matrix family, where functionally similar matrices are grouped together, eliminating redundant matches by MatInspector professional™ (if the family option was selected). E.g. the matrix family V$NFKB includes 5 similar matrices for NFkappaB (V$NFKAPPAB.01, V$NFKAPPAB.02, V$NFKAPPAB.03, V$NFKAPPAB50.01, V$NFKAPPAB65.01) as well as 1 matrix for the NFkappaB related factor c-Rel (V$CREL.01).

Matrix: The MatInspector professional™ matrices have an identifier that indicates one of the following seven groups: vertebrates (V$), insects (I$), plants (P$), fungi (F$), nematodes (N$), bacteria (B$), and other functional elements (O$); followed by an acronym for the factor the matrix refers to, and a consecutive number discriminating between different matrices for the same factor. Thus, V$OCT1.02 indicates the second matrix for vertebrate Oct-1 factor.

Core Sim: The “core sequence” of a matrix is defined as the (usually 4) consecutive highest conserved positions of the matrix. The core similarity is calculated as described here. The maximum core similarity of 1.0 is only reached when the highest conserved bases of a matrix match exactly in the sequence. More important than the core similarity is the matrix similarity which takes into account all bases over the whole matrix length.

Matrix Sim: The matrix similarity is calculated as described here. A perfect match to the matrix gets a score of 1.00 (each sequence position corresponds to the highest conserved nucleotide at that position in the matrix), a “good” match to the matrix usually has a similarity of >0.80. Mismatches in highly conserved positions of the matrix decrease the matrix similarity more than mismatches in less conserved regions.
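The identifier convention described under “Matrix” above (group prefix, factor acronym, consecutive number) can be illustrated with a short Python sketch. This is not part of MatInspector professional™; the function name `parse_matrix_id`, the regular expression, and the assumption that the “other functional elements” prefix is the letter O are illustrative choices only.

```python
import re
from typing import NamedTuple

# Group prefixes as listed in the text. Assumption: the prefix for
# "other functional elements" is the letter O.
GROUPS = {
    "V": "vertebrates",
    "I": "insects",
    "P": "plants",
    "F": "fungi",
    "N": "nematodes",
    "B": "bacteria",
    "O": "other functional elements",
}


class MatrixId(NamedTuple):
    group: str    # organism group named by the prefix
    factor: str   # acronym of the factor the matrix refers to
    number: int   # consecutive number among matrices for that factor


# Hypothetical pattern: one prefix letter, "$", factor acronym, ".", digits.
_PATTERN = re.compile(r"^([A-Z])\$([A-Z0-9_]+)\.(\d+)$")


def parse_matrix_id(identifier: str) -> MatrixId:
    """Split an identifier such as 'V$OCT1.02' into its three parts."""
    match = _PATTERN.match(identifier)
    if match is None or match.group(1) not in GROUPS:
        raise ValueError(f"not a recognized matrix identifier: {identifier!r}")
    prefix, factor, number = match.groups()
    return MatrixId(GROUPS[prefix], factor, int(number))
```

For example, `parse_matrix_id("V$OCT1.02")` yields the group "vertebrates", the factor "OCT1", and the number 2, matching the reading given in the text.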

Another aspect of the invention provides for a reporter construct. Reporter constructs contain a 5′ regulatory region nucleotide sequence fragment of SEQ ID NO: 23 (e.g., an enhancer and/or repressor binding site containing region), a promoter element (which may or may not be from INGAP regulatory region nucleotide sequence, SEQ ID NO: 23), and a reporter gene. The 5′-regulatory region nucleotide sequence is positioned upstream of the reporter gene. In order to determine the identity of various transcription factors that bind the 5′ regulatory region nucleotide sequence and to elucidate their binding locations within the 5′ regulatory nucleotide sequence of the INGAP gene, the region may be mapped using deletion analysis. One or more fragments of the regulatory region nucleotide sequence may be initially analyzed for their responses to various transcription factor activators. Once a region of interest is determined, further fine mapping may be carried out where DNA from different locations within the regulatory region could be combined to make a more robust and responsive reporter construct. DNA sequences, such as INGAP 5′-regulatory region DNA or a fragment thereof, can be manipulated by methods well known in the art. Examples of such techniques include, but are not limited to, polymerase chain reaction (PCR), restriction enzyme endonuclease digestion, ligation, and gene walking. Cloning fragments of DNA, such as 5′-regulatory regions, is well known in the art.

Another approach to quantify the expression levels of a gene is to measure transcription of the gene. PCR-ELISA may be used to capture transcripts onto a solid phase using biotin or digoxigenin-labelled primers, oligonucleotide probes (oligoprobes) or directly after incorporation of the digoxigenin into the transcripts (Watzinger, F. and Lion, T. (2001) Nucleic Acids Res., 29, e52). Once captured, the transcripts can be detected using an enzyme-labeled avidin or anti-digoxigenin reporter molecule similar to a standard ELISA format. Another approach is to employ real-time PCR to detect the transcript of the reporter gene (Mackay, I. M. and Nitsche, A., Nucleic Acids Res. 2002 Mar. 15; 30(6), 1292-305). In real-time PCR fluorogenic nucleotides are used and progress of the transcript is monitored in real-time as the polymerase transcribes the reporter gene.

The promoter element in the reporter construct may or may not be from the same gene as the 5′-regulatory region. As an example, the enhancer/repressor region from the INGAP 5′-regulatory region, or a fragment of the enhancer/repressor region from the INGAP 5′-regulatory region, may be cloned upstream of a heterologous minimal promoter element, e.g., the minimal CMV promoter (Boshart et al., 1985) and the promoters for TK (Nordeen, 1988), IL-2, and MMTV.

Transcription of a gene begins around the minimal promoter. FIG. 4 shows the predicted transcription start sites for mammalian INGAP gene (SEQ ID NO: 2). SEQ ID NO: 2 was analyzed using “Neural Network Promoter Prediction” program designed by Martin Reese to identify eukaryotic promoter recognition elements such as TATA-box, GC-box, CAAT-box, and the transcription start site. These promoter elements are present in various combinations separated by various distances in sequence. The program is available on the Internet and is located at http://www.fruitfly.org/seq_tools/promoter.html.

The reporter construct can be used to identify agents that modulate, either alone or in combination, the expression of INGAP. Some such agents may modulate expression of INGAP by binding to the regulatory region directly while others may regulate expression of transcription factors that bind to the INGAP regulatory region.

The reporter construct can be transfected into a host cell in vitro, or in vivo through the pancreatic duct, either transiently or stably, and a test agent introduced to the assay system. Examples of test agents include, but are not limited to, organic and inorganic chemical agents, carbohydrates, proteins, oligonucleotides, cholecystokinin, mechanically induced pressure, and agents which cause a pancreatic duct obstruction. Expression of the reporter gene product can be determined by an assay appropriate for the reporter gene employed. Examples of such assays include, but are not limited to, a luminescent assay for β-galactosidase or luciferase, an enzymatic assay for chloramphenicol acetyltransferase, and fluorescence detection for fluorescent proteins. Such assays are well known in the art, and a skilled artisan will be able to select an appropriate assay for the chosen reporter. A test agent is identified as a modulator of INGAP expression if the test agent modulates expression of the reporter gene product. Preferably the level of increase or decrease is at least 50%, 100%, 200%, 500%, or 1000%, but any statistically significant change can be an indicator of modulatory activity. A skilled artisan may also determine reporter gene product expression in untreated cells, and in treated and untreated cells transfected with a promoter-less reporter gene only. Such determinations can be used to determine background levels of expression.
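The threshold comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the simple subtraction of a promoter-less background signal, and the default 50% cutoff are hypothetical choices, and the sketch does not perform the statistical significance testing that the text also allows as a criterion.

```python
def percent_change(treated: float, untreated: float, background: float = 0.0) -> float:
    """Percent change in reporter signal of treated versus untreated cells.

    `background` is the signal from cells carrying a promoter-less reporter
    (assumption: it is simply subtracted from both readings).
    """
    baseline = untreated - background
    if baseline <= 0:
        raise ValueError("untreated signal must exceed background")
    return 100.0 * ((treated - background) - baseline) / baseline


def is_modulator(treated: float, untreated: float,
                 background: float = 0.0, threshold_pct: float = 50.0) -> bool:
    """True if the increase or decrease is at least `threshold_pct` percent."""
    return abs(percent_change(treated, untreated, background)) >= threshold_pct
```

With hypothetical luminescence readings of 300 units for treated cells and 100 units for untreated cells, `percent_change(300, 100)` gives a 200% increase, which would pass the 50%, 100%, and 200% cutoffs named in the text. Note that a decrease can never exceed 100% of baseline under this definition.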

Test agents can also be obtained by fractionating pancreatic secretion fluids. A pancreatic duct obstruction can be used as an exemplary method of harvesting pancreatic secretion fluids. The pancreatic secretion fluids can be fractionated by methods well known in the art. Examples include high-pressure liquid chromatography (HPLC), size exclusion chromatography, hydrophobic interaction columns, and density gradient centrifugation. Individual fractions can be tested for agents that modulate reporter gene expression using a method described herein. The individual fractions can be further fractionated to identify agents that modulate reporter gene expression. The identified test agents can be used to modulate the expression of INGAP.

A host cell can be any cell suitable for transfection and maintenance in a suitable assay system. Examples of suitable cells include, but are not limited to, mammalian cells, human cells, mouse cells, rat cells, monkey cells, dog cells, bovine cells, and porcine cells. Preferably the cells used will be human cells. The cells could be either transformed cell lines or primary cells. Whole organ explants may also be used where the regulation may be monitored over many different cell types. Many methods exist in the art for transfecting or infecting cells with reporter construct DNA. Such methods include, but are not limited to, lipofection, electroporation, calcium phosphate precipitation, DEAE dextran, gene guns, and modified viral techniques (e.g., recombinant adenovirus or recombinant retrovirus). The skilled artisan can readily choose a method suitable for use with a given cell type and assay system.

The reporter construct can also be introduced in vivo directly into cells of the pancreas. Examples of methods to introduce the reporter construct into pancreatic cells in vivo include pancreatic duct retrograde perfusion and in vivo electroporation (Mir, 2001). The reporter construct encodes a reporter gene product that is readily measured in vivo. A test agent can be administered systemically or locally, and expression of the reporter gene in vivo can be determined by an assay appropriate for the particular reporter employed. Examples of such assays include a fluorescence assay for green fluorescent protein.

Methods for identifying agents that modulate INGAP expression can also be accomplished in vitro. The reporter construct can be contacted with a test agent in vitro under conditions sufficient for transcription and/or translation of the reporter gene. Components such as rabbit reticulocyte lysates or wheat germ extracts can be utilized for such a method. Subsequently, the expression level of the reporter gene can be determined as described above utilizing an appropriate assay for a given reporter gene. A test agent is identified as a modulator of INGAP expression if the test agent modulates expression of the reporter gene. Threshold levels of change can be set by the practitioner as discussed above.

A test agent can alternatively be contacted with an isolated and purified INGAP 5′-regulatory DNA molecule and one can determine if the test agent binds to the DNA molecule. Test agents can be a chemical agent, a protein, or a nucleic acid. Appropriate INGAP 5′-regulatory DNA molecules would include nucleotides 1-6586 of SEQ ID NO: 2, the 5′-regulatory region DNA (SEQ ID NO: 1, or SEQ ID NO: 23), or any fragment of the 5′-regulatory region, preferably a fragment which contains one or more enhancer/repressor binding sites. Methods to determine binding of the test agent to the fragment of DNA are well known in the art, e.g., electrophoretic mobility shift assay (EMSA). See for example Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2d ed., 1989, at pages 9.50-9.51. Fragments of the 5′-regulatory region can be obtained by methods well known in the art using the disclosed sequence (SEQ ID NO: 2). Examples of such methods include PCR, restriction enzyme digestion, and chemical synthesis. Any fragment of DNA within the 5′-regulatory region (SEQ ID NO: 1, or 23) can be used. The exact location that an agent binds can be determined, for example, by utilizing smaller fragments to map precisely the binding site for the test agent. Test agents that bind in the assay can be further tested in other assays that require modulatory activity.

An agent that causes an increase or decrease in reporter gene expression can be used as a modulator of INGAP expression. The modulator can be administered to a mammal in need of such modulation. Examples of mammals that may need INGAP expression modulation are those with reduced pancreatic function, in particular reduced islet cell function. Such mammals include those who have diabetes mellitus, impaired glucose tolerance, impaired fasting glucose, hyperglycemia, obesity, and pancreatic insufficiency.

An agent that is identified as a modulator of INGAP expression can be supplied in a kit to treat diseases associated with reduced islet cell function. The kit would comprise, in single or divided containers and in single or divided doses, a modulator of INGAP expression. Written instructions may be included for using the modulator of INGAP expression. The instructions may simply refer a reader to another location such as a website or other information source.

Agents that cause an increase in reporter gene expression can be used to increase INGAP expression to treat a disease state related to reduced islet cell function. Agents that cause a decrease in reporter gene expression can be used to decrease INGAP expression to treat a disease state related to hyperactivity of islet cells or a disease where reduced INGAP expression is desirable. Examples of such agents include, but are not limited to, PMA, LIF, interleukin-6, Oncostatin M, and ciliary neurotrophic factor. Agents can be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, parenteral, topical, sublingual, rectal, or pancreatic duct retrograde perfusion. Agents for oral administration can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for oral administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the mammal. Agents for intravenous, intramuscular, intra-arterial, transdermal, and subcutaneous injections can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for injection into the mammal. Agents for intranasal, topical, and rectal administration can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for surface administration to the mammal. Mammals in need of an increase in INGAP expression include, for example, mammals with diabetes mellitus, impaired glucose tolerance, impaired fasting glucose, hyperglycemia, obesity, and pancreatic insufficiency. Mammals in need of a decrease in INGAP expression include, for example, mammals with hypoglycemia.

The following examples are offered by way of illustration and do not limit the invention disclosed herein.

EXAMPLES

Example 1 Hamster INGAP Genomic Sequence and Structure

The hamster INGAP genomic sequence and structure were determined by gene walking (Clontech) and DNA sequencing. Gene walking is a method for walking upstream toward a promoter or downstream in genomic DNA from a known sequence, such as cDNA. This method utilizes four uncloned, adapter-ligated genomic fragment libraries. The manufacturer's recommended protocol was followed with one notable exception: hamster genomic DNA was used to create the uncloned, adapter-ligated genomic fragment libraries.

To create uncloned, adapter-ligated genomic fragment libraries, genomic DNA was purified from hamster cells. Four separate aliquots were thoroughly digested with PvuII, StuI, DraI, or EcoRV. Following digestion, inactivation of the restriction enzymes, and dephosphorylation, each separate pool of DNA fragments was ligated to an adapter (see FIG. 5). The adapter was phosphorylated to provide the requisite phosphate group for a ligation reaction. Also note that the 3-prime side of the short adapter contains an amine group to prevent the adapters from forming concatemers.
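The digestion step can be illustrated in silico. The following is a minimal sketch, not part of the described protocol: the recognition sequences are the standard ones for these four blunt-cutting enzymes, while the toy sequence and the function name `digest` are hypothetical and chosen only for illustration.

```python
# Standard recognition sites for the four blunt cutters named above.
# Each enzyme cuts in the middle of its 6-bp site (offset 3).
ENZYMES = {
    "PvuII": ("CAGCTG", 3),
    "StuI": ("AGGCCT", 3),
    "DraI": ("TTTAAA", 3),
    "EcoRV": ("GATATC", 3),
}


def digest(seq, site, cut_offset):
    """Return the fragments produced by cutting a linear sequence at every site."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence (NOT hamster genomic DNA) carrying one PvuII,
# one DraI, and one EcoRV site, and no StuI site.
toy = "AAAACAGCTGCCCCTTTAAAGGGGGATATCTTTT"

for name, (site, offset) in ENZYMES.items():
    print(name, digest(toy, site, offset))
```

Because each enzyme leaves blunt ends, every fragment pool can in principle accept the same blunt-ended adapter, which is why a single adapter design serves all four libraries.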

Two gene specific primers (GSP1 and GSP2) were designed for each region of known sequence (i.e., the exons of the INGAP gene). See FIG. 6 for the fragment locations and the GSP1 and GSP2 locations. The gene specific primers were designed as reverse PCR primers for all fragments except fragments 1_2 and 14_5, for which they were designed as forward primers. Adapter primer 1 (AP1) and adapter primer 2 (AP2) (FIG. 5) were forward PCR primers for all fragments except fragments 1_2 and 14_5, for which they were reverse PCR primers. The outer gene specific primer (GSP1) was used with adapter primer 1 in a PCR reaction. To increase specificity, a second, nested PCR was set up using the inner gene specific primer (GSP2) and adapter primer 2. A small aliquot of the first reaction served as template for the second reaction. Gene specific PCR primers utilized for gene walking are listed in Table 2 and the strategy used to build the INGAP genomic sequence is shown in FIGS. 6 and 7. The arrowheads in FIG. 6 represent the adapter primers (AP1 and AP2), while the circles represent the gene specific primers (GSP1 and GSP2).

TABLE 2

NAME        LOCATION      SEQUENCE                             SEQ ID NO
INGEN 21_3  (1464, 1482)  5′-ACAAGCAATCTAGAGATGG-3′            3
INGEN 19_3  (1401, 1423)  5′-GTTCAGCTATGTTCATAGCAGGG-3′        4
INGEN 16_3  (1855, 1876)  5′-GTCTGTATGACTGTGTGGGAAG-3′         5
INGEN 15_3  (1929, 1948)  5′-GCACTTGAACTCAATGGCTC-3′           6
INGEN 14_3  (2147, 2168)  5′-GAACCACCTGACATGGGTGATG-3′         7
INGEN 13_3  (2177, 2200)  5′-GGGCATCGTATCATCTGGTTACAG-3′       8
INGEN 8_3   (2544, 2565)  5′-GGTTCAAAAAAGCTGCTTCAAC-3′         9
INGEN 7_3   (2666, 2689)  5′-GGAATAGCTGCAATTTATGCCCAT-3′       10
INGEN 4_3   (2833, 2858)  5′-CTTAGGAACATTCAGGCAGCCTCCTG-3′     11
INGEN 3_3   (2866, 2891)  5′-GTTGCCCTCTGCCACGTGTCAAGTTC-3′     12
INGEN 2_3   (3444, 3470)  5′-CATCCAAGACATCCTACAGAGGGTCAT-3′    13
INGEN 1_3   (3475, 3501)  5′-CCCAAGAAAGGAACATCAGGCAGGAAA-3′    14
INGEN 2_2   (3330, 3350)  5′-CCAAATGAGTGCTTCCCTGAA-3′          15
INGEN 1_2   (3241, 3266)  5′-GCAGCACTCTGAAACTCAGTAGAGTT-3′     16
INGEN 14_5  (5544, 5563)  5′-GCTGCTGACCGTGGTTATTG-3′           17
INGEN 13_5  (5463, 5485)  5′-ACACTACCCAACGGAAGTGGATG-3′        18
INGAP1_1L   (3475, 3492)  5′-TTTCCTGCCTGATGTTCC-3′             19
INGAP1_1R   (5957, 5976)  5′-TCATACTTGCTTCCTTGTCC-3′           20
INGAP2_1L   (4470, 4488)  5′-CTTCACGTATAACCTGTCC-3′            21
INGAP2_1R   (5905, 5923)  5′-ATTAGAACTGCCCTAGACC-3′            22
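The entries of Table 2 can be sanity-checked with a short script. The sketch below copies four primers from the table, confirms that each listed coordinate pair spans exactly the primer length, and shows that the reverse primer INGEN 1_3 is the reverse complement of the same region that INGAP1_1L covers in the forward direction. The helper names are hypothetical.

```python
# Primers copied from Table 2: name -> (sequence 5'->3', start, end).
PRIMERS = {
    "INGEN 21_3": ("ACAAGCAATCTAGAGATGG", 1464, 1482),
    "INGEN 19_3": ("GTTCAGCTATGTTCATAGCAGGG", 1401, 1423),
    "INGEN 1_3": ("CCCAAGAAAGGAACATCAGGCAGGAAA", 3475, 3501),
    "INGAP1_1L": ("TTTCCTGCCTGATGTTCC", 3475, 3492),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def span_matches_length(seq, start, end):
    """True when the inclusive coordinate span equals the primer length."""
    return end - start + 1 == len(seq)


for name, (seq, start, end) in PRIMERS.items():
    print(name, len(seq), span_matches_length(seq, start, end))

# INGEN 1_3 and INGAP1_1L both start at position 3475; the reverse
# complement of the reverse primer begins with the forward primer.
rc = reverse_complement(PRIMERS["INGEN 1_3"][0])
print(rc.startswith(PRIMERS["INGAP1_1L"][0]))
```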

The PCR fragments were sequenced to determine the nucleotide sequence of the INGAP 5′-regulatory region, the introns, the intron/exon junctions, and the 3-prime polyadenylation regions. The nucleotide sequence of hamster INGAP genomic DNA is shown in SEQ ID NO: 2.

Example 2 Cloning Hamster INGAP 5′-Regulatory Region Fragment into a Reporter Construct

To construct the INGAP 5′-regulatory region, individual PCR fragments were joined together at unique restriction sites located within two adjoining fragments. FIGS. 6 and 7 detail the strategy used to piece the INGAP 5′-regulatory region together. Fragments 8_3 and 2_3 were joined at a unique SphI site; 14_3 and 8_3 were joined at a unique BbsI site; 16_3 and 14_3 were joined at a unique PstI site. The nucleotide sequence of hamster INGAP 5′-regulatory region DNA is shown in SEQ ID NOS: 1 and 23 in the sequence listing.

The hamster INGAP 5′-regulatory region or a fragment of the 5′-regulatory region was cloned into a reporter plasmid, pβGal-Basic (Clontech). The 5′-regulatory region or fragments were cloned utilizing the unique XmaI site from the gene walking adapter primer and a unique BglII site located at the 3-prime side of the regulatory region. FIG. 8 details the fragments cloned into pβGal-Basic. The sizes of the fragments are indicated to the right of the fragments and are expressed as the number of nucleotides of the fragment.
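Whether a restriction site is unique within a given sequence can be checked computationally before fragments are joined or cloned. The following is a minimal sketch under stated assumptions: the recognition sequences are the standard ones for SphI, BbsI, PstI, XmaI, and BglII; the toy sequence is hypothetical and is not taken from SEQ ID NO: 1; and the function names are illustrative only. Both strands are searched because BbsI has a non-palindromic recognition sequence.

```python
# Standard recognition sequences for the enzymes named in this example.
SITES = {
    "SphI": "GCATGC",
    "BbsI": "GAAGAC",   # non-palindromic, so both strands must be searched
    "PstI": "CTGCAG",
    "XmaI": "CCCGGG",
    "BglII": "AGATCT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sites(seq, site):
    """Count occurrences of a recognition site on either strand."""
    seq = seq.upper()
    # A palindromic site equals its reverse complement; the set removes the duplicate.
    patterns = {site, reverse_complement(site)}
    total = 0
    for pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            total += 1
            start = seq.find(pattern, start + 1)
    return total


def is_unique(seq, site):
    """True when the site occurs exactly once in the sequence."""
    return count_sites(seq, site) == 1


# Hypothetical toy sequence: one SphI site, one BbsI site (bottom strand),
# two PstI sites, and no XmaI or BglII site.
toy = "TTGCATGCAAGTCTTCAACTGCAGTTCTGCAGAA"

for name, site in SITES.items():
    print(name, count_sites(toy, site), is_unique(toy, site))
```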

Example 3 Assay System to Screen for Factors that Modulate the Expression of INGAP

Promoter analysis of INGAP identified a number of potential promoter-proximal regulatory sites, including the consensus transcription factor binding sites cAMP response element (CRE), AP-1, and STAT. Promoter-fragment reporter-gene constructs were transiently transfected into 293T cells, and co-transfection of secretory alkaline phosphatase was used to normalize for transfection efficiency.

Reporter constructs containing INGAP 5′-regulatory region fragments 2_3sP (SEQ ID NO: 37), 2_3dP (SEQ ID NO: 38), 2_3pP (SEQ ID NO: 36), 14_3P (SEQ ID NO: 34), 16_3P (SEQ ID NO: 31), or 19_3P (SEQ ID NO: 23) were transfected into human cells. The pβGal-Basic plasmid without the hamster INGAP DNA was also transfected into human cells as a control to measure the level of endogenous reporter activity. Two days following transfection, the cells were treated with PMA for 24 hours or were untreated. To determine the level of promoter activity, the amount of β-galactosidase gene product was determined using a luminescent assay for β-galactosidase. FIG. 9A shows that construct 14_3P drove the highest level of reporter expression, followed by 2_3pP and 16_3P.

A reporter construct containing INGAP 5′-regulatory region DNA nucleotides 2030 to 3120 was transfected into human cells. The pβGal-Basic plasmid without the hamster INGAP DNA was also transfected into human cells as a control to measure the level of endogenous reporter activity. Two days following transfection, the cells were treated with LIF for 24 hours or were untreated. To determine the level of promoter activity, the amount of β-galactosidase gene product was determined using a luminescent assay for β-galactosidase. FIG. 9B shows the results. LIF was determined to increase the activity of the 5′-regulatory region of mammalian INGAP. Forskolin (an activator of cAMP/CREB/CRE) did not modulate gene expression (data not shown).
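The normalization to co-transfected secretory alkaline phosphatase and the comparison of treated with untreated cells described in this example can be expressed as simple arithmetic. The sketch below is illustrative only: the function names and all numeric readings are hypothetical, and subtracting the promoterless pβGal-Basic signal as background is an assumption rather than a step stated in this example.

```python
def normalized_activity(bgal_signal, seap_signal):
    """Divide the beta-galactosidase signal by the co-transfected SEAP signal
    to correct for differences in transfection efficiency."""
    if seap_signal <= 0:
        raise ValueError("SEAP signal must be positive")
    return bgal_signal / seap_signal


def fold_induction(treated, untreated, background=0.0):
    """Ratio of treated to untreated normalized activity.

    `background` is the normalized activity of the promoterless control;
    subtracting it is an assumption made for this sketch.
    """
    denominator = untreated - background
    if denominator <= 0:
        raise ValueError("untreated activity must exceed background")
    return (treated - background) / denominator


# Hypothetical luminometer readings (arbitrary units), not data from FIG. 9.
treated = normalized_activity(bgal_signal=1200.0, seap_signal=300.0)
untreated = normalized_activity(bgal_signal=500.0, seap_signal=250.0)
print(fold_induction(treated, untreated, background=1.0))
```

Under these assumptions, a fold induction above 1 would indicate an agent that up-regulates the reporter, and a value below 1 an agent that down-regulates it.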

It is important to note that when present in human cells, the hamster INGAP 5′-regulatory region is transactivated by the human transcription factors. Thus, linked to a reporter gene, the 5′-regulatory region of hamster INGAP creates a sensitive assay system to screen for factors that modulate the expression of INGAP.

Example 4 Determination of Approximate Location of PMA and LIF-Mediated Transcription Factor Binding in the 5′-Regulatory Region

To map the approximate location of PMA-initiated or LIF-initiated transcription factor binding, different fragments of the hamster INGAP 5′-regulatory region were cloned into pβGal-Basic. See FIG. 8. The fragments cloned into the reporter construct were 2_3sP (SEQ ID NO: 37), 2_3dP (SEQ ID NO: 38), 2_3pP (SEQ ID NO: 36), 14_3P (SEQ ID NO: 34), 16_3P (SEQ ID NO: 31), and 19_3P (SEQ ID NO: 23). The reporter constructs were transfected into human cells. Two days following transfection, the cells were treated with different concentrations of PMA or LIF for 24 hours. The concentrations of PMA used were 6 ng/ml, 17 ng/ml, 50 ng/ml, 100 ng/ml, or 300 ng/ml. The concentrations of LIF used were 1 ng/ml, 10 ng/ml, or 30 ng/ml. To determine the level of promoter activity, the amount of β-galactosidase gene product was determined using a luminescent assay for β-galactosidase. FIGS. 10 and 11 show the results for PMA and LIF treatment, respectively. Both PMA and LIF activated the reporter constructs in the cells. The exact location of the DNA contact sites can be narrowed further by cloning smaller fragments of the hamster INGAP 5′-regulatory region and by site directed mutations or deletions.
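The treatment matrix described in this example can be enumerated programmatically, for instance when planning plate layouts. The sketch below takes the construct names and concentrations from the text above; the function name and the idea of generating a flat list of conditions are illustrative assumptions, and untreated controls and replicates are not included.

```python
# Constructs and concentrations as listed in this example.
CONSTRUCTS = ["2_3sP", "2_3dP", "2_3pP", "14_3P", "16_3P", "19_3P"]
DOSES_NG_PER_ML = {
    "PMA": [6, 17, 50, 100, 300],
    "LIF": [1, 10, 30],
}


def treatment_conditions():
    """Return every (construct, agent, dose in ng/ml) combination."""
    return [
        (construct, agent, dose)
        for construct in CONSTRUCTS
        for agent, doses in DOSES_NG_PER_ML.items()
        for dose in doses
    ]


conditions = treatment_conditions()
print(len(conditions))   # 6 constructs x (5 PMA + 3 LIF doses) = 48
print(conditions[0])     # ('2_3sP', 'PMA', 6)
```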

Example 5 RNA Analysis of INGAP Gene Upregulation

To determine if INGAP RNA levels increase after stimulation with a cytokine that signals through STAT, rat amphicrine pancreatic cells (AR42J) were treated with IL-6 (1000 U/ml) for 24 hours. Total RNA was extracted from the treated and untreated cells using techniques well known in the art, e.g., using TRIzol® reagent.

Equal amounts of total RNA (10 μg) were loaded on a 2.5% formaldehyde gel and electrophoresed for 4 hours at 70 V with constant circulation of the buffer using a circulating pump. The gel was photographed, washed with water twice at room temperature, and soaked in 20×SSC. The RNA was transferred from the gel to a nylon membrane (Amersham) in 20×SSC overnight following a standard procedure. The membrane was washed with 20×SSC to remove any agar that might have attached to the membrane and baked for 4 hours at 80° C.

One hundred nanograms of hamster INGAP cDNA was labeled using a Random Prime Labeling kit (Roche-BMB) and alpha-³²P dCTP (ICN). Approximately 20 million counts were used for hybridization in 20 ml hybridization buffer following the standard procedure overnight at 42° C. The blot was washed as follows: twice at room temperature with 2×SSC for 10 minutes each; twice at 42° C. with 2×SSC for 10 minutes each; twice at 55° C. with 1×SSC for 10 minutes each. The membrane was exposed to film (XOMAT-Kodak) and kept at −80° C. overnight before developing.
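The wash buffers and probe concentration above follow from simple dilution arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 500 ml wash volume is an assumed figure that does not appear in this example.

```python
def stock_volume_ml(stock_x, final_x, final_volume_ml):
    """Volume of concentrated stock needed, from C1 * V1 = C2 * V2."""
    if stock_x <= 0 or final_x > stock_x:
        raise ValueError("final concentration must not exceed the stock")
    return final_x * final_volume_ml / stock_x


def counts_per_ml(total_counts, volume_ml):
    """Probe activity per ml of hybridization buffer."""
    return total_counts / volume_ml


# 20x SSC stock diluted to the 2x and 1x wash buffers.
# The 500 ml batch size is a hypothetical value chosen for illustration.
print(stock_volume_ml(20, 2, 500))   # ml of 20x SSC for 500 ml of 2x SSC
print(stock_volume_ml(20, 1, 500))   # ml of 20x SSC for 500 ml of 1x SSC

# Approximately 20 million counts in 20 ml of hybridization buffer.
print(counts_per_ml(20_000_000, 20))
```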

Treatment with IL-6 caused an increase in INGAP gene expression (FIG. 12). These data demonstrate that extracellular factors that elevate AP-1-binding transcription factors and STAT-binding transcription factors are involved in the regulation of INGAP gene expression. These studies suggest that it is feasible to enhance INGAP expression as a means of inducing islet neogenesis.

While particular embodiments of the present invention have been illustrated and described, it would be obvious to those skilled in the art that various other changes and modifications can be made without departing from the spirit and scope of the invention. It is therefore intended to cover in the appended claims all such changes and modifications that are within the scope of this invention.

TABLE 3 Position Core Matrix Family/matrix Further Information Opt.from-to anchor Str. sim. sim. Sequence V$LEFF/LEF1.01 TCF/LEF-1,involved in the 0.86 12-28 20 (+) 1.000 0.900 ggaccatCAAAgtctgt Wntsignal transduction pathway V$MITF/MIT.01 MIT (microphthalmia 0.81 22-4031 (+) 1.000 0.823 agtctgtCATGtcatttgg transcription factor) and TFE3V$OCT1/OCT1.05 octamer-binding factor 1 0.90 27-41 34 (+) 0.833 0.904gTCATgtcatttggg V$TCFF/TCF11.01 TCF11/KCR-F1/Nrf1 1.00 32-38 35 (+)1.000 1.000 GTCAttt homodimers V$MYOF/MYOGNF1.01 Myogenin/nuclear factor1 or 0.71 25-53 39 (+) 1.000 0.735 ctgtcatgtcatTTGGgggagggcctatg relatedfactors V$ZBPF/ZBP89.01 Zinc finger transcription factor 0.93 36-48 42(−) 1.000 0.982 gccctCCCCcaaa ZBP-89 V$SP1F/GC.01 GC box elements 0.8838-52 45 (+) 0.876 0.898 tgggGGAGggcctat V$PERO/PPARA.01 PPAR/RXRheterodimers 0.70 44-64 54 (−) 0.884 0.708 acagaggagggcATAGgccctV$PAX5/PAX9.01 zebrafish PAX9 binding sites 0.78 43-71 57 (−) 0.8000.811 cagataCACAgaggagggcataggccctc V$TBPF/ATATA.01 Avian C-type LTRTATA box 0.81 68-84 76 (−) 1.000 0.987 tgctattTAAGcccaga V$HMTB/MTBF.01muscle-specific Mt binding 0.90 76-84 80 (−) 1.000 0.932 tgctATTTa siteV$OCT1/OCT1.06 octamer-binding factor 1 0.80 74-88 81 (−) 0.750 0.865ggtatgctATTTaag V$BRNF/BRN2.01 POU factor Brn-2 (N-Oct 3) 0.91  89-10597 (+) 1.000 0.970 tccataggAAATgggct V$HMTB/MTBF.01 muscle-specific Mtbinding 0.90 108-116 112 (−) 1.000 0.953 tggaATTTg site V$OCT1/OCT1.05octamer-binding factor 1 0.90 106-120 113 (−) 0.944 0.917tATATggaatttggg V$HNF6/HNF6.01 Liver enriched Cut - 0.82 108-122 115 (+)0.833 0.885 caaatTCCAtatatg Homeodomain transcription factor HNF6(ONECUT) V$SRFF/SRF.02 serum response factor 0.83 110-128 119 (+) 1.0000.862 aattCCATatatgcactag V$OCTP/OCT1P.01 octamer-binding factor 1, 0.86114-126 120 (+) 1.000 0.903 ccatatATGCact POU-specific domainV$MYOF/MYOGNF1.01 Myogenin/nuclear factor 1 or 0.71 171-199 185 (+)0.857 0.740 ctggtcttttagCTGGcacccatccatat related factors V$NF1F/NF1.02Nuclear factor 1 
(CTF1) 0.81 181-199 190 (+) 1.000 0.812agcTGGCacccatccatat V$CLOX/CDPCR3HD.01 cut-like homeodomain protein 0.94187-203 195 (−) 0.929 0.940 ctgaatatgGATGggtg V$MYOF/MYOGNF1.01Myogenin/nuclear factor 1 or 0.71 181-209 195 (−) 0.785 0.767aaccctctgaatATGGatgggtgccagct related factors V$OCTP/OCT1P.01octamer-binding factor 1, 0.86 192-204 198 (+) 0.980 0.907 atccatATTCagaPOU-specific domain V$CREB/TAXCREB.02 Tax/CREB complex 0.71 202-222 212(−) 0.750 0.721 ttgaacTGAAccaaaccctct V$HOXF/EN1.01 Homeobox proteinengrailed 0.77 210-226 218 (−) 0.782 0.823 aacaTTGAactgaacca (en-1)V$BARB/BARBIE.01 barbiturate-inducible element 0.88 230-244 237 (−)1.000 0.894 ttatAAAGctgagga V$TBPF/TATA.01 cellular and viral TATA box0.90 230-246 238 (−) 1.000 0.910 agttaTAAAgctgagga elementsV$BARB/BARBIE.01 barbiturate-inducible element 0.88 252-266 259 (−)1.000 0.902 agtgAAAGcagagag V$MYT1/MYT1.01 MyT1 zinc fingertranscription 0.75 272-284 278 (−) 0.750 0.756 craCAGTtgacct factorinvolved in primary neurogenesis V$SMAD/SMAD4.01 Smad4 transcriptionfactor 0.94 304-312 308 (+) 1.000 0.940 GTCTtgact involved in TGF-betasignaling V$HOXF/CRX.01 Cone-rod homeobox- 0.94 312-328 320 (−) 1.0000.960 gagggATTAgaaaagga containing transcription factor/ otx-likehomeobox gene V$ECAT/NFY.01 nuclear factor Y (Y-box 0.90 337-351 344 (−)1.000 0.906 ggaatCCAAtygtag binding factor) V$HOXF/PTX1.01 PituitaryHomeobox 1 (Ptx1) 0.79 337-353 345 (+) 0.789 0.802 ctacraTTGGattccatV$FKHD/FREAC2.01 Fork head RElated ACtivator-2 0.84 362-378 370 (−)1.000 0.897 tacagcTAAAcactgag V$MINI/MUSCLE_INI.02 Muscle InitiatorSequence 0.86 401-419 410 (−) 0.840 0.865 gagcctTCATccagtagctV$MOKF/MOK2.01 Ribonucleoprotein associated 0.74 409-429 419 (−) 1.0000.746 tgtcatcttagagCCTTcatc zinc finger protein MOK-2 (mouse)V$ZFIA/ZID.01 zinc finger with interaction 0.85 414-426 420 (+) 1.0000.861 agGCTCtaagatg domain V$CART/XVENT2.01 Xenopus homeodomain factor0.82 418-434 426 (+) 0.750 0.837 tcTAAGatgacaattaa Xvent-2; early BMPsignaling response 
V$OCT1/OCT1.04 octamer-binding factor 1 0.80 421-435428 (+) 0.807 0.840 aaGATGacaattaag V$HOMS/S8.01 Binding site for S8type 0.97 426-434 430 (+) 1.000 0.994 gacaATTAa homeodomainsV$NKXH/NKX25.02 homeo domain factor Nkx- 0.88 424-436 430 (−) 1.0001.000 cctTAATtgtcat 2.5/Csx, tinman homolog low affinity sitesV$CREB/CREBP1.01 cAMP-responsive element 0.80 425-445 435 (−) 0.7660.808 cgacgattACCTtaattgtca binding protein 1 V$COMP/COMP1.01 COMP1,cooperates with 0.76 434-454 444 (−) 0.750 0.768 aatgaggATCGacgattacctmyogenic proteins in multicomponent complex V$HOXF/HOX1-3.01 Hox-1.3,vertebrate 0.83 444-460 452 (+) 1.000 0.886 cgatcctcATTAtagtg homeoboxprotein V$ETSF/GABP.01 GABP: GA binding protein 0.85 454-470 462 (+)1.000 0.868 tatagtGGAAgggcttc V$LEFF/LEF1.01 TCF/LEF-1, involved in the0.86 463-479 471 (+) 1.000 0.904 agggcttCAAAggcagt Wnt signaltransduction pathway V$STAT/STAT6.01 STAT6: signal transducer and 0.84464-482 473 (−) 0.758 0.867 gagacTGCCtttgaagccc activator oftranscription 6 V$GATA/GATA1.03 GATA-binding factor 1 0.95 490-502 496(−) 1.000 0.971 ttcaGATAggcag V$SRFF/SRF.01 serum response factor 0.66487-505 496 (−) 0.757 0.672 atgttcaGATAggcagtag V$EVI1/EVI1.04 Ecotropicviral integration site 0.77 493-509 501 (−) 0.800 0.824gGAAAtgttcagatagg 1 encoded factor V$AP4R/TH1E47.01 Thing1/E47heterodimer, TH1 0.93 509-525 517 (+) 1.000 0.951 cctaatgCCAGatgtct bHLHmember specific expression in a variety of embryonic tissues V$AP4R/Tal-1beta/ITF-2 heterodimer 0.85 512-528 520 (+) 1.000 0.852aatgcCAGAtgtctctt TAL1BETAITF2.01 V$NEUR/NEUROD1.01 DNA binding site for0.83 514-526 520 (−) 1.000 0.851 gagaCATCtggca NEUROD1 (BETA-2/E47dimer) V$MEF2/MEF2.05 MEF2 0.96 518-540 529 (−) 1.000 0.984aggataggttTAAAgagacatct V$EVI1/EVI1.04 Ecotropic viral integration site0.77 523-539 531 (−) 1.000 0.774 gGATAggtttaaagaga 1 encoded factorV$MEF2/AMEF2.01 myocyte enhancer factor 0.80 521-543 532 (+) 1.000 0.813tgtctcttTAAAcctatcctggc V$TBPF/MTATA.01 Muscle TATA box 0.84 524-540 532(+) 
1.000 0.877 ctcttTAAAcctatcct V$HOXF/HOX1-3.01 Hox-1.3, vertebrate0.83 543-559 551 (+) 1.000 0.845 ctcccttcATTAaggta homeobox proteinV$PDX1/ISL1.01 Pancreatic and intestinal lim- 0.82 543-563 553 (−) 1.0000.834 gagatacctTAATgaagggag homeodomain factor V$OCT1/OCT1.05octamer-binding factor 1 0.90 556-570 563 (+) 0.944 0.926gGTATctcatttttt V$CIZF/NMP4.01 NMP4 (nuclear matrix protein 0.97 562-572567 (−) 1.000 0.972 gcAAAAaatga 4)/CIZ (Cas-interacting zinc fingerprotein) V$EVI1/EVI1.01 Ecotropic viral integration site 0.72 569-585577 (−) 0.764 0.720 ggaaCAGAggagagcaa 1 encoded factor V$AP1F/AP1.01 AP1binding site 0.95 582-602 592 (−) 0.881 0.964 aaaactgaATCAgtggnggaaV$PIT1/PIT1.01 Pit1, GHF-1 pituitary specific 0.86 589-599 594 (+) 1.0000.886 actgATTCagt pou domain transcription factor V$AP1F/AP1.01 AP1binding site 0.95 586-606 596 (+) 0.850 0.956 nccactgaTTCAgtttttctgV$VMYB/VMYB.01 v-Myb 0.90 593-603 598 (−) 0.876 0.910 aaaAACTgaatV$CIZF/NMP4.01 NMP4 (nuclear matrix protein 0.97 595-605 600 (−) 1.0000.975 agAAAAactga 4)/CIZ (Cas-interacting zinc finger protein)V$GREF/PRE.01 Progesterone receptor binding 0.84 604-622 613 (+) 1.0000.875 ctgatccctctTGTTctcc site V$GKLF/GKLF.01 Gut-enriched Krueppel-like0.91 632-646 639 (−) 1.000 0.971 gaaaaagagaAGGGa factor V$CIZF/NMP4.01NMP4 (nuclear matrix protein 0.97 637-647 642 (−) 1.000 0.987ggAAAAagaga 4)/CIZ (Cas-interacting zinc finger protein) V$NFAT/NFAT.01Nuclear factor of activated T- 0.97 640-650 645 (−) 1.000 0.982ggagGAAAaag cells V$MAZF/MAZ.01 Myc associated zinc finger 0.90 649-661655 (−) 1.000 0.910 ggtgGAGGgaagg protein (MAZ) V$EGRF/WT1.01 WilmsTumor Suppressor 0.88 658-672 665 (−) 1.000 0.932 gggggTGGGagggtgV$ZBPF/ZBP89.01 Zinc finger transcription factor 0.93 663-675 669 (+)1.000 0.972 tcccaCCCCcatg ZBP-89 V$IRFF/IRF2.01 interferon regulatoryfactor 2 0.80 702-716 709 (−) 1.000 0.815 aggaagggGAAAggg V$BRNF/BRN2.01POU factor Brn-2 (N-Oct 3) 0.91 746-762 754 (−) 1.000 0.911aaaataggAAATaagga V$ETSF/PU1.01 Pu.1 (Pu120) 
Ets-like 0.86 746-762 754(−) 1.000 0.883 aaaataGGAAataagga transcription factor identified inlymphoid B-cells V$EVI1/EVI1.04 Ecotropic viral integration site 0.77750-766 758 (−) 0.760 0.792 aGAGAaaataggaaata 1 encoded factorV$EVI1/EVI1.05 Ecotropic viral integration site 0.80 755-771 763 (−)0.763 0.817 cccccagagaaAATAgg 1 encoded factor V$ZBPF/ZBP89.01 Zincfinger transcription factor 0.93 764-776 770 (−) 1.000 0.934ccacaCCCCcaga ZBP-89 V$FAST/FAST1.01 FAST-1 SMAD interacting 0.81769-783 776 (+) 0.983 0.894 gggtgtgGATTttat protein V$TBPF/TATA.02Mammalian C-type LTR TATA 0.89 771-787 779 (−) 1.000 0.942caccaTAAAatccacac box V$PAX5/PAX9.01 zebrafish PAX9 binding sites 0.78781-809 795 (−) 0.866 0.813 aacataTGCAcagaagggcttccaccata V$OCT1/OCT.01Octamer binding site 0.79 793-807 800 (−) 1.000 0.790 catATGCacagaagg(OCT1/OCT2 consensus) V$OCTP/OCT1P.01 octamer-binding factor 1, 0.86798-810 804 (−) 1.000 0.910 caacatATGCaca POU-specific domainV$SRFF/SRF.01 serum response factor 0.66 797-815 806 (+) 0.757 0.666ctgtgcaTATGttgtctta V$EVI1/EVI1.05 Ecotropic viral integration site 0.80802-818 810 (−) 0.750 0.828 caataagacaaCATAtg 1 encoded factorV$CLOX/CDP.01 cut-like homeodomain protein 0.75 803-819 811 (−) 1.0000.776 ccAATAagacaacatat V$EVI1/EVI1.02 Ecotropic viral integration site0.83 807-823 815 (−) 1.000 0.836 tcaaccaatAAGAcaac 1 encoded factorV$ECAT/NFY.02 nuclear factor Y (Y-box 0.91 810-824 817 (−) 1.000 0.960atcaaCCAAtaagac binding factor) V$HAML/AML3.01 Runt-relatedtranscription 0.84 811-825 818 (+) 1.000 0.844 tcttatTGGTtgata factor2/CBFA1 (core- binding factor, runt domain, alpha subunit 1)V$PCAT/CAAT.01 cellular and viral CCAAT box 0.90 813-823 818 (−) 1.0000.943 tcaaCCAAtaa V$GATA/GATA.01 GATA binding site 0.95 818-830 824 (+)1.000 0.956 ggttGATAaataa (consensus) V$HNF1/HNF1.02 Hepatic nuclearfactor 1 0.76 818-834 826 (+) 0.757 0.791 gGTTGataaataaagca V$HOXT/Homeobox protein MEIS1 0.79 823-835 829 (−) 0.750 0.797 gTGCTttatttatMEIS1_HOXA9.01 binding site 
V$ECAT/NFY.01 nuclear factor Y (Y-box 0.90837-851 844 (+) 1.000 0.912 gttgtCCAAtaggga binding factor)V$FKHD/FREAC2.01 Fork head RElated ACtivator-2 0.84 844-860 852 (+)0.750 0.843 aataggGAAAcaagata V$EVI1/EVI1.06 Ecotropic viral integrationsite 0.83 846-862 854 (+) 1.000 0.960 tagggaaacaAGATagg 1 encoded factorV$GATA/GATA1.01 GATA-binding factor 1 0.96 853-865 859 (+) 1.000 0.970acaaGATAggtgg V$PCAT/ACAAT.01 Avian C-type LTR CCAAT box 0.86 856-866861 (−) 0.750 0.867 cccaCCTAtct V$XBBF/RFX1.01 X-box binding proteinRFX1 0.89 909-927 918 (−) 1.000 0.929 ggatcacatgGCAAccctcV$EBOX/MYCMAX.02 c-Myc/Max heterodimer 0.92 912-928 920 (−) 0.895 0.936aggatCACAtggcaacc V$MITF/MIT.01 MIT (microphthalmia 0.81 911-929 920 (+)1.000 0.863 gggttgcCATGtgatccta transcription factor) and TFE3V$ETSF/PU1.01 Pu.1 (Pu120) Ets-like 0.86 927-943 935 (+) 1.000 0.950ctaggaGGAAttgacac transcription factor identified in lymphoid B-cellsV$OCT1/OCT1.06 octamer-binding factor 1 0.80 932-946 939 (−) 1.000 0.800catgtgtcAATTcct V$TALE/TGIF.01 TG-interacting factor 1.00 936-942 939(−) 1.000 1.000 tGTCAat belonging to TALE class of homeodomain factorsV$MITF/MIT.01 MIT (microphthalmia 0.81 935-953 944 (−) 1.000 0.835ccattctCATGtgtcaatt transcription factor) and TFE3 v$OCT1/OCT1.04octamer-binding factor 1 0.80 941-955 948 (+) 0.846 0.800caCATGagaatgggg V$GATA/GATA.01 GATA binding site 0.95 962-974 968 (+)1.000 0.998 gaaaGATAagtcc (consensus) V$SRFF/SRF.01 serum responsefactor 0.66 968-986 977 (−) 1.000 0.672 atattttTATAaggacttaV$CDXF/CDX2.01 Cdx-2 mammalian caudal 0.84 970-988 979 (−) 1.000 0.867atatattTTTAtaaggact related intestinal transcr. 
factor V$FKHD/XFD2.01Xenopus fork head domain 0.89 972-988 980 (+) 1.000 0.894tccttaTAAAaatatat factor 2 V$MEF2/MEF2.01 myogenic enhancer factor 20.74 970-992 981 (+) 1.000 0.740 agtccttaTAAAaatatatatta V$TBPF/TATA.01cellular and viral TATA box 0.90 973-989 981 (+) 1.000 0.963ccttaTAAAaatatata elements V$CART/CART1.01 Cart-1 (cartilage 0.84978-994 986 (−) 1.000 0.870 acTAATatatattttta homeoprotein 1)V$CART/CART1.01 Cart-1 (cartilage 0.84  985-1001 993 (−) 1.000 0.855caTAATtactaatatat homeoprotein 1) V$SATB/SATB1.01 Special AT-richsequence- 0.93  985-1001 993 (−) 1.000 0.943 cataattacTAATatat bindingprotein 1, predominantly expressed in thymocytes, binds to matrixattachment regions (MARs) V$BRNF/BRN3.01 POU transcription factor Brn-30.78  987-1003 995 (−) 1.000 0.816 cccATAAttactaatat V$CLOX/CDP.01cut-like homeodomain protein 0.75  987-1003 995 (−) 0.757 0.765ccCATAattactaatat V$HOMS/S8.01 Binding site for S8 type 0.97  992-1000996 (+) 1.000 0.989 agtaATTAt homeodomains V$NKXH/DLX1.01 DLX-1, −2, and−5 binding 0.91  990-1002 996 (−) 1.000 0.976 ccatAATTactaa sitesV$HOXF/HOX1-3.01 Hox-1.3, vertebrate 0.83  989-1005 997 (−) 1.000 0.886aacccataATTActaat homeobox protein V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1)pancreatic 0.74  988-1008 998 (−) 1.000 0.775 attaacccaTAATtactaata andintestinal homeodomain TF V$FKHD/XFD3.01 Xenopus fork head domain 0.82 998-1014 1006 (+) 0.826 0.844 tatgggttAATAattaa factor 3 V$HNF1/HNF1.01hepatic nuclear factor 1 0.78 1000-1016 1008 (−) 0.755 0.857aCTTAattattaaccca V$HNF1/HNF1.01 hepatic nuclear factor 1 0.78 1002-10181010 (+) 1.000 0.966 gGTTAataattaagtca V$PAX4/PAX4.01 Pax-4 paireddomain protein, 0.97 1005-1015 1010 (+) 1.000 0.972 taatAATTaag togetherwith PAX-6 involved in pancreatic development V$HOMS/S8.01 Binding sitefor S8 type 0.97 1007-1015 1011 (−) 1.000 0.995 cttaATTAt homeodomainsV$HOXF/HOX1-3.01 Hox-1.3, vertebrate 0.83 1003-1019 1011 (−) 1.000 0.873ctgacttaATTAttaac homeobox protein V$NKXH/DLX1.01 DLX-1, −2, and −5binding 0.91 
1005-1017 1011 (+) 1.000 0.988 taatAATTaagtc sitesV$RBIT/BRIGHT.01 Bright, B cell regulator of IgH 0.92 1005-1017 1011 (+)1.000 0.931 taataATTAagtc transcription V$TBPF/ATATA.01 Avian C-type LTRTATA box 0.81 1005-1021 1013 (+) 1.000 0.881 taataatTAAGtcagagV$CREB/CREBP1.01 cAMP-responsive element 0.80 1004-1024 1014 (−) 0.7660.819 tagctctgACTTaattattaa binding protein 1 v$RORA/RORA2.01RAR-related orphan receptor 0.82 1007-1023 1015 (+) 0.750 0.874ataattaAGTCagagct alpha2 V$PCAT/CAAT.01 cellular and viral CCAAT box0.90 1022-1032 1027 (+) 0.856 0.928 ctagCCATtaa V$NKXH/NKX25.02 homeodomain factor Nkx- 0.88 1022-1034 1028 (−) 1.000 0.903 tctTAATggctag2.5/Csx, tinman homolog low affinity sites V$CREB/HLF.01 hepaticleukemia factor 0.84 1022-1042 1032 (−) 0.770 0.842ctagtGTTTcttaatggctag V$HOXF/HOX1-3.01 Hox-1.3, vertebrate 0.831056-1072 1064 (+) 1.000 0.891 gcttcataATTAatata homeobox proteinV$HOMS/S8.01 Binding site for S8 type 0.97 1061-1069 1065 (−) 1.0000.995 attaATTAt homeodomains V$NKXH/DLX1.01 DLX-1, −2, and −5 binding0.91 1059-1071 1065 (+) 1.000 0.988 tcatAATTaatat sites V$RBIT/BRIGHT.01Bright, B cell regulator of IgH 0.92 1059-1071 1065 (+) 1.000 0.952tcataATTAatat transcription V$BRNF/BRN2.01 POU factor Brn-2 (N-Oct 3)0.91 1058-1074 1066 (+) 1.000 0.945 ttcataatTAATatagt V$OCT1/OCT1.06octamer-binding factor 1 0.80 1060-1074 1067 (−) 1.000 0.885actatattAATTatg V$HOXF/HOX1-3.01 Hox-1.3, vertebrate 0.83 1061-1077 1069(−) 1.000 0.854 gatactatATTAattat homeobox protein V$OCT1/OCT1.06octamer-binding factor 1 0.80 1079-1093 1086 (+) 0.750 0.875tgtatgttCATTtgg V$FAST/FAST1.01 FAST-1 SMAD interacting 0.81 1080-10941087 (+) 0.850 0.887 gtatgttCATTtggg protein V$RREB/RREB1.01Ras-responsive element 0.79 1081-1095 1088 (−) 1.000 0.816cCCCAaatgaacata binding protein 1 V$E2FF/E2F.02 E2F, involved in cellcycle 0.84 1085-1099 1092 (−) 1.000 0.849 tcagcccCAAAtgaa regulation,interacts with Rb p107 protein V$CREB/TAXCREB.01 Tax/CREB complex 0.811091-1111 1101 (+) 1.000 0.828 
tggggcTGACacagttctggg V$AP1F/VMAF.01v-Maf 0.82 1092-1112 1102 (+) 1.000 0.833 ggggcTGACacagttctgggaV$MYT1/MYT1.01 MyT1 zinc finger transcription 0.75 1123-1135 1129 (+)0.750 0.791 aggAAGAytactt factor involved in primary neurogenesisV$CLOX/CLOX.01 Clox 0.81 1136-1152 1144 (−) 0.804 0.820cctacaATCCatgtacc V$HNF4/HNF4.01 Hepatic nuclear factor 4 0.82 1156-11721164 (−) 1.000 0.864 atagagCAAAggactac V$LEFF/LEF1.01 TCF/LEF-1,involved in the 0.86 1157-1173 1165 (−) 1.000 0.907 catagagCAAAggactaWnt signal transduction pathway V$PERO/PPARA.01 PPAR/RXR heterodimers0.70 1157-1177 1167 (−) 1.000 0.700 tagacatagagcAAAGgacta V$CLOX/CLOX.01Clox 0.81 1173-1189 1181 (+) 0.804 0.831 gtctaaATCCatatatgV$HNF6/HNF6.01 Liver enriched Cut - 0.82 1175-1189 1182 (+) 0.833 0.929ctaaaTCCAtatatg Homeodomain transcription factor HNF6 (ONECUT)V$SRFF/SRF.02 serum response factor 0.83 1177-1195 1186 (+) 1.000 0.851aaatCCATatatgaatgag V$CLOX/CDPCR3.01 cut-like homeodomain protein 0.751180-1196 1188 (−) 1.000 0.761 actcattcatatATGGa V$PIT1/PIT1.01 Pit1,GHF-1 pituitary specific 0.86 1186-1196 1191 (−) 1.000 0.919 actcATTCatapou domain transcription factor V$HMTB/MTBF.01 muscle-specific Mtbinding 0.90 1196-1204 1200 (−) 0.807 0.901 tggtATGTa siteV$FKHD/HFH8.01 HNF-3/Fkh Homolog-8 0.92 1200-1216 1208 (−) 1.000 0.922gaaagayAAACatggta V$E4FF/E4F.01 GLI-Krueppel-related 0.82 1223-1235 1229(−) 0.789 0.898 gtgAGGTaacccc transcription factor, regulator ofadenovirus E4 promoter V$CREB/HLF.01 hepatic leukemia factor 0.841221-1241 1231 (+) 1.000 0.854 atgggGTTAcctcactcagga V$VBPF/VBP.01PAR-type chicken vitellogenin 0.86 1226-1236 1231 (+) 1.000 0.903gTTACctcact promoter-binding protein V$OCT1/OCT.01 Octamer binding site0.79 1259-1273 1266 (−) 0.758 0.870 cgcAGGCaaatgaat (OCT1/OCT2consensus) V$STAT/STAT6.01 STAT6: signal transducer and 0.84 1261-12791270 (+) 0.758 0.850 tcattTGCCtgcgaatttt activator of transcription 6V$CDXF/CDX2.01 Cdx-2 mammalian caudal 0.84 1270-1288 1279 (+) 1.0000.869 tgcgaatTTTAagattcca 
related intestinal transcr. factorV$SORY/SOX9.01 SOX (SRY-related HMG box) 0.90 1280-1296 1288 (−) 1.0000.990 taaaaCAATggaatctt V$FKHD/HFH2.01 HNF-3/Fkh Homolog 2 0.931285-1301 1293 (−) 1.000 0.931 aggaataaAACAatgga V$CDXF/CDX2.01 Cdx-2mammalian caudal 0.84 1286-1304 1295 (+) 1.000 0.865 ccattgtTTTAttcctctgrelated intestinal transcr. factor V$OCTB/TST1.01 POU-factor Tst-1/Oct-60.87 1288-1302 1295 (−) 0.894 0.876 gaggAATAaaacaat V$PDX1/ISL1.01Pancreatic and intestinal lim- 0.82 1298-1318 1308 (+) 1.000 0.824tcctctgagTAATactccatt homeodomain factor V$SORY/SOX9.01 SOX (SRY-relatedHMG box) 0.90 1308-1324 1316 (−) 1.000 0.925 ttacaCAATggagtattV$CREB/HLF.01 hepatic leukemia factor 0.84 1310-1330 1320 (−) 0.9010.920 ggtacATTAcacaatggagta V$VBPF/VBP.01 PAR-type chicken vitellogenin0.86 1315-1325 1320 (−) 1.000 0.871 aTTACacaatg promoter-binding proteinV$CEBP/CEBPB.01 CCAAT/enhancer binding 0.94 1313-1331 1322 (+) 0.9290.955 tccattgtGTAAtgtacca protein beta V$PDX1/ISL1.01 Pancreatic andintestinal lim- 0.82 1313-1333 1323 (+) 1.000 0.859tccattgtgTAATgtaccaca homeodomain factor V$HAML/AML1.01 runt-factorAML-1 1.00 1323-1337 1330 (−) 1.000 1.000 aaaatgTGGTacatt V$GREF/ARE.01Androgene receptor binding 0.80 1323-1341 1332 (+) 0.750 0.819aatgtaccacaTTTTctcc site V$TEAF/TEF1.01 TEF-1 related muscle factor 0.841343-1355 1349 (+) 1.000 0.896 taCATTcttcagt V$CMYB/CMYB.01 c-Myb,important in 0.99 1352-1360 1356 (+) 1.000 0.990 caGTTGagg hematopoesis,cellular equivalent to avian myoblastosis virus oncogene v-mybV$AP4R/TH1E47.01 Thing1/E47 heterodimer, TH1 0.93 1378-1394 1386 (−)1.000 0.932 gcaatagCCAGaacctg bHLH member specific expression in avariety of embryonic tissues V$CP2F/CP2.01 CP2 0.90 1384-1394 1389 (−)1.000 0.945 gcaatagCCAG V$CHOP/CHOP.01 heterodimers of CHOP and 0.901386-1398 1392 (−) 1.000 0.951 attTGCAatagcc C/EBPalpha V$CEBP/CEBP.02C/EBP binding site 0.85 1385-1403 1394 (+) 1.000 0.853tggctattGCAAataaccc V$MEF2/HMEF2.01 myocyte enhancer factor 0.761384-1406 1395 (+) 1.000 
0.809 ctggctattgcAAATaaccctgc V$OCT1/OCT1.03octamer-binding factor 1 0.85 1388-1402 1395 (+) 1.000 0.889ctattgcAAATaacc V$HMTB/MTBF.01 muscle-specific Mt binding 0.90 1394-14021398 (−) 1.000 0.900 ggttATTTg site V$CLOX/CDPCR3.01 cut-likehomeodomain protein 0.75 1422-1438 1430 (+) 0.975 0.761acatatgtcattATTGt V$OCT1/OCT1.05 octamer-binding factor 1 0.90 1423-14371430 (+) 0.944 0.938 cATATgtcattattg V$HOXF/HOX1-3.01 Hox-1.3,vertebrate 0.83 1423-1439 1431 (+) 1.000 0.836 catatgtcATTAttgtahomeobox protein V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1) pancreatic 0.741423-1443 1433 (−) 1.000 0.889 ttcatacaaTAATgacatatg and intestinalhomeodomain TF V$SORY/SOX5.01 Sox-5 0.87 1426-1442 1434 (−) 1.000 0.870tcataCAATaatgacat V$OCT1/OCT1.05 octamer-binding factor 1 0.90 1444-14581451 (−) 0.944 0.914 aATATgtaaaacaga V$CREB/E4BP4.01 E4BP4, bZIP domain,0.80 1443-1463 1453 (−) 1.000 0.856 tttaaaatatGTAAaacagattranscriptional repressor V$VBPF/VBP.01 PAR-type chicken vitellogenin0.86 1449-1459 1454 (+) 1.000 0.886 tTTACatattt promoter-binding proteinV$TBPF/MTATA.01 Muscle TATA box 0.84 1455-1471 1463 (+) 1.000 0.841tatttTAAAccatctct V$PBXF/PBX1.01 homeo domain factor Pbx-1 0.781469-1481 1475 (−) 1.000 0.783 caagCAATctaga V$COMP/COMP1.01 COMP1,cooperates with 0.76 1467-1487 1477 (+) 1.000 0.765tctctagATTGcttgtaatat myogenic proteins in multicomponent complexV$SORY/SOX5.01 Sox-5 0.87 1478-1494 1486 (−) 1.000 0.997tttaaCAATattacaag V$FKHD/FREAC2.01 Fork head RElated ACtivator-2 0.841485-1501 1493 (+) 1.000 0.885 tattgtTAAAcatagag V$PDX1/ISL1.01Pancreatic and intestinal lim- 0.82 1495-1515 1505 (+) 1.000 0.839catagagagTAATaatgctat homeodomain factor V$HOXF/HOX1-3.01 Hox-1.3,vertebrate 0.83 1499-1515 1507 (−) 1.000 0.872 atagcattATTActctchomeobox protein V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1) pancreatic 0.741498-1518 1508 (−) 0.826 0.843 tttatagcaTTATtactctct and intestinalhomeodomain TF V$CART/XVENT2.01 Xenopus homeodomain factor 0.821502-1518 1510 (+) 1.000 0.829 agTAATaatgctataaa Xvent-2; early 
BMPsignaling response V$CDXF/CDX2.01 Cdx-2 mammalian caudal 0.84 1507-15251516 (−) 1.000 0.906 tttaattTTTAtagcatta related intestinal transcr.factor V$MEF2/MEF2.05 MEF2 0.96 1505-1527 1516 (+) 1.000 0.983aataatgctaTAAAaattaaaaa V$HNF1/HNF1.01 hepatic nuclear factor 1 0.781510-1526 1518 (−) 0.755 0.805 tTTTAatttttatagca V$OCT1/OCT1.06octamer-binding factor 1 0.80 1511-1525 1518 (+) 1.000 0.832gctataaaAATTaaa V$TBPF/TATA.02 Mammalian C-type LTR TATA 0.89 1510-15261518 (+) 1.000 0.991 tgctaTAAAaattaaaa box V$NKXH/MSX.01 Homeodomainproteins MSX- 0.97 1514-1526 1520 (−) 1.000 0.989 tttTAATttttat 1 andMSX-2 V$RBIT/BRIGHT.01 Bright, B cell regulator of IgH 0.92 1515-15271521 (+) 1.000 0.944 taaaaATTAaaaa transcription V$MEF2/AMEF2.01 myocyteenhancer factor 0.80 1514-1536 1525 (+) 1.000 0.807ataaaaatTAAAaataatgataa V$EVI1/EVI1.02 Ecotropic viral integration site0.83 1526-1542 1534 (+) 1.000 0.872 aataatgatAAGAaaga 1 encoded factorV$GATA/GATA1.02 GATA-binding factor 1 0.99 1528-1540 1534 (+) 1.0000.993 taatGATAagaaa V$GATA/GATA3.02 GATA-binding factor 3 0.91 1537-15491543 (+) 1.000 0.931 gaaAGATcctata V$GATA/GATA3.02 GATA-binding factor 30.91 1559-1571 1565 (+) 1.000 0.915 tacAGATgaaaat V$OCT1/OCT1.02octamer-binding factor 1 0.82 1561-1575 1568 (+) 0.763 0.867cagATGAaaatttag V$CEBP/CEBPB.01 CCAAT/enhancer binding 0.94 1567-15851576 (+) 0.985 0.964 aaaatttaGAAAtacttta protein beta V$PLZF/PLZF.01Promyelocytic leukemia zink 0.86 1574-1588 1581 (−) 0.958 0.866agcTAAAgtatttct finger (TF with nine Krueppel- like zink fingers)V$PAX3/PAX3.01 Pax-3 paired domain protein, 0.76 1587-1599 1593 (−)1.000 0.763 TCGTcagtggtag expressed in embryogenesis, mutationscorrelate to Waardenburg Syndrome V$CREB/ATF.01 activating transcriptionfactor 0.90 1588-1608 1598 (+) 1.000 0.923 taccacTGACgaaatttgtatV$AP4R/TH1E47.01 Thing1/E47 heterodimer, TH1 0.93 1614-1630 1622 (−)1.000 0.959 tttaattCCAGacattc bHLH member specific expression in avariety of embryonic tissues V$NKXH/MSX.01 Homeodomain proteins 
MSX-0.97 1619-1631 1625 (−) 1.000 0.977 cttTAATtccaga 1 and MSX-2V$RBIT/BRIGHT.01 Bright, B cell regulator of IgH 0.92 1620-1632 1626 (+)1.000 0.923 ctggaATTAaaga transcription V$OCTB/TST1.01 POU-factorTst-1/Oct-6 0.87 1620-1634 1627 (+) 1.000 0.898 ctggAATTaaagaaaV$NKXH/DLX3.01 Distal-less 3 homeodomain 0.91 1628-1640 1634 (−) 1.0000.915 cagTAATttcttt transcription factor V$GREF/PRE.01 Progesteronereceptor binding 0.84 1628-1646 1637 (+) 1.000 0.922 aaagaaattacTGTTctttsite V$TBPF/TATA.01 cellular and viral TATA box 0.90 1636-1652 1644 (−)1.000 0.934 ttataTAAAgaacagta elements V$FKHD/XFD2.01 Xenopus fork headdomain 0.89 1637-1653 1645 (−) 1.000 0.890 attataTAAAgaacagt factor 2V$TBPF/TATA.01 cellular and viral TATA box 0.90 1638-1654 1646 (−) 0.8910.923 tattaTATAaagaacag elements V$CREB/E4BP4.01 E4BP4, bZIP domain,0.80 1638-1658 1648 (−) 0.769 0.856 ctattattatATAAagaacagtranscriptional repressor V$PDX1/ISL1.01 Pancreatic and intestinal lim-0.82 1644-1664 1654 (+) 1.000 0.836 tttatataaTAATagactgta homeodomainfactor V$COMP/COMP1.01 COMP1, cooperates with 0.76 1648-1668 1658 (+)0.791 0.760 tataataATAGactgtaaaat myogenic proteins in multicomponentcomplex V$TBPF/TATA.02 Mammalian C-type LTR TATA 0.89 1658-1674 1666 (+)1.000 0.912 gactgTAAAatggcaac box V$IRFF/ISRE.01 interferon-stimulated0.81 1662-1676 1669 (+) 0.750 0.817 gtaaaatgGCAActt response elementV$XBBF/RFX1.01 X-box binding protein RFX1 0.89 1660-1678 1669 (+) 1.0000.907 ctgtaaaatgGCAActttt V$MYT1/MYT1.02 MyT1 zinc finger transcription0.88 1667-1679 1673 (−) 1.000 0.882 taaAAGTtgccat factor involved inprimary neurogenesis V$OCT1/OCT1.06 octamer-binding factor 1 0.801683-1697 1690 (+) 1.000 0.878 tatttgctAATTcac V$AP1F/TCF11MAFG.01TCF11/MafG heterodimers, 0.81 1681-1701 1691 (−) 0.777 0.865tcctgTGAAttagcaaatatt binding to subclass of AP1 sites V$NKXH/MSX2.01Muscle segment homeo box 0.95 1687-1699 1693 (+) 1.000 0.969tgCTAAttcacag 2, homologue of Drosophila (HOX 8) V$FAST/FAST1.01 FAST-1SMAD interacting 0.81 
1687-1701 1694 (−) 0.850 0.866 tcctgtgAATTagcaprotein V$PBXC/ Binding site for a Pbx1/Meis1 0.76 1686-1702 1694 (+)0.750 0.788 ttgctaatTCACaggat PBX1_MEIS1.03 heterodimer V$CIZF/NMP4.01NMP4 (nuclear matrix protein 0.97 1699-1709 1704 (−) 1.000 0.973agAAAAaatcc 4)/CIZ (Cas-interacting zinc finger protein) V$STAT/STAT6.01STAT6: signal transducer and 0.84 1702-1720 1711 (−) 1.000 0.908agatgTTCCaaagaaaaaa activator of transcription 6 V$AP4R/ Tal-1beta/E47heterodimer 0.87 1710-1726 1718 (−) 1.000 0.919 ttgttCAGAtgttccaaTAL1BETAE47.01 V$SORY/HMGIY.01 HMGI(Y) high-mobility-group 0.921720-1736 1728 (+) 1.000 0.953 tgaacaAATTtccctta protein I (Y),architectural transcription factor organizing the framework of a nuclearprotein-DNA transcriptional complex V$MYT1/MYT1.01 MyT1 zinc fingertranscription 0.75 1723-1735 1729 (+) 0.750 0.757 acaAATTtccctt factorinvolved in primary neurogenesis V$SRFF/SRF.01 serum response factor0.66 1728-1746 1737 (+) 1.000 0.771 tttccctTATAtgaatcac V$HOXF/HOXA9.01Member of the vertebrate 0.87 1731-1747 1739 (−) 1.000 0.908agtGATTcatataaggg HOX - cluster of homeobox factors V$HOXT/ Homeoboxprotein MEIS1 0.79 1734-1746 1740 (−) 1.000 0.797 gTGATtcatataaMEIS1_HOXA9.01 binding site V$PIT1/PIT1.01 Pit1, GHF-1 pituitaryspecific 0.86 1737-1747 1742 (−) 1.000 0.912 agtgATTCata pou domaintranscription factor V$AP1F/AP1.01 AP1 binding site 0.95 1734-1754 1744(+) 0.881 0.958 ttatatgaATCActtacattt V$VBPF/VBP.01 PAR-type chickenvitellogenin 0.86 1746-1756 1751 (+) 1.000 0.860 cTTACatttttpromoter-binding protein V$FAST/FAST1.01 FAST-1 SMAD interacting 0.811757-1771 1764 (+) 0.850 0.829 gcctgttCATTtaaa protein V$HOXF/EN1.01Homeobox protein engrailed 0.77 1759-1775 1767 (−) 1.000 0.832gtttTTTAaatgaacag (en-1) V$TBPF/MTATA.01 Muscle TATA box 0.84 1763-17791771 (+) 1.000 0.853 tcattTAAAaaactgca V$ETSF/ETS2.01 c-Ets-2 bindingsite 0.86 1774-1790 1782 (+) 1.000 0.866 actgcAGGAaagttgtgV$MYT1/MYT1.02 MyT1 zinc finger transcription 0.88 1780-1792 1786 (+)1.000 0.891 
ggaAAGTtgtgat factor involved in primary neurogenesisV$GFI1/GFI1.01 Growth factor independence 1 0.97 1782-1796 1789 (−)1.000 1.000 ataAATCacaacttt zinc finger protein acts as transcriptionalrepressor V$TBPF/TATA.01 cellular and viral TATA box 0.90 1784-1800 1792(−) 1.000 0.931 cattaTAAAtcacaact elements V$BRNF/BRN2.01 POU factorBrn-2 (N-Oct 3) 0.91 1786-1802 1794 (−) 1.000 0.933 tgcattatAAATcacaaV$HOXT/ Homeobox protein MEIS1 0.79 1788-1800 1794 (+) 1.000 0.924gTGATttataatg MEIS1_HOXA9.01 binding site V$MEF2/AMEF2.01 myocyteenhancer factor 0.80 1783-1805 1794 (−) 0.866 0.827agttgcatTATAaatcacaactt V$OCTB/TST1.01 POU-factor Tst-1/Oct-6 0.871787-1801 1794 (+) 0.894 0.898 tgtgATTTataatgc V$HOXF/HOXA9.01 Member ofthe vertebrate 0.87 1787-1803 1795 (+) 1.000 0.971 tgtGATTtataatgcaaHOX - cluster of homeobox factors V$BRNF/BRN2.01 POU factor Brn-2 (N-Oct3) 0.91 1788-1804 1796 (+) 1.000 0.916 gtgatttaTAATgcaac V$PARF/DBP.01Albumin D-box binding 0.84 1791-1805 1798 (+) 0.884 0.891atttaTAATgcaact protein V$OCT1/OCT1.02 octamer-binding factor 1 0.821795-1809 1802 (+) 1.000 0.861 ataATGCaactgcac V$FKHD/FREAC2.01 Forkhead RElated ACtivator-2 0.84 1816-1832 1824 (+) 1.000 0.910cagtctTAAAcaatgct V$SORY/SOX5.01 Sox-5 0.87 1821-1837 1829 (+) 1.0000.992 ttaaaCAATgctaacca V$AREB/AREB6.04 AREB6 (Atp1a1 regulatory 0.981837-1849 1843 (+) 1.000 0.981 actgtGTTTcagc element binding factor 6)V$MYT1/MYT1.02 MyT1 zinc finger transcription 0.88 1848-1860 1854 (−)1.000 0.889 gggAAGTttatgc factor involved in primary neurogenesisV$RBPF/RBPJK.01 Mammalian transcriptional 0.84 1851-1865 1858 (−) 1.0000.878 tgtgTGGGaagttta repressor RBP-Jkappa/CBF1 V$OCT1/OCT1.02octamer-binding factor 1 0.82 1875-1889 1882 (+) 0.763 0.826actATGAaaacacat V$FKHD/FREAC4.01 Fork head RElated ACtivator-4 0.781875-1891 1883 (+) 1.000 0.786 actatgaaAACAcatgc V$EBOX/MYCMAX.02c-Myc/Max heterodimer 0.92 1880-1896 1888 (+) 0.895 0.920gaaaaCACAtgcttaaa V$PAX6/PAX6.01 Pax-6 paired domain protein 0.751880-1898 1889 (−) 0.773 
0.791 cctttAAGCatgtgttttc V$IRFF/IRF3.01Interferon regulatory factor 3 0.86 1891-1905 1898 (+) 1.000 0.874cttaaaggCAAAtct (IRF-3) V$HNF1/HNF1.02 Hepatic nuclear factor 1 0.761895-1911 1903 (−) 0.858 0.782 aGGTAaagatttgcctt V$FKHD/FREAC2.01 Forkhead RElated ACtivator-2 0.84 1898-1914 1906 (−) 1.000 0.853ctgaggTAAAgatttgc V$E4FF/E4F.01 GLI-Krueppel-related 0.82 1902-1914 1908(−) 0.789 0.830 ctgAGGTaaagat transcription factor, regulator ofadenovirus E4 promoter V$CREB/CREBP1.01 cAMP-responsive element 0.801900-1920 1910 (+) 0.766 0.820 aaatctttACCTcagttaact binding protein 1V$VBPF/VBP.01 PAR-type chicken vitellogenin 0.86 1905-1915 1910 (+)1.000 0.862 tTTACctcagt promoter-binding protein V$MYT1/MYT1.01 MyT1zinc finger transcription 0.75 1912-1924 1918 (−) 0.750 0.775gaaTAGTtaactg factor involved in primary neurogenesis V$HNF1/HNF1.01hepatic nuclear factor 1 0.78 1913-1929 1921 (+) 1.000 0.811aGTTAactattccatag V$PCAT/CAAT.01 cellular and viral CCAAT box 0.901928-1938 1933 (+) 0.856 0.925 agagCCATtga V$HNF6/HNF6.01 Liver enrichedCut - 0.82 1929-1943 1936 (−) 1.000 0.873 tgaacTCAAtggctc Homeodomaintranscription factor HNF6 (ONECUT) V$PXRF/PXRCAR.01 Halfsite of PXR(pregnane X 0.98 1935-1945 1940 (−) 1.000 0.980 ctTGAActcaareceptor)/RXR resp. 
CAR (constitutive androstane receptor)/RXRheterodimer binding site V$RARF/RTR.01 Retinoid receptor-related 0.811934-1952 1943 (+) 1.000 0.854 attgagtTCAAgtgcattt testis-associatedreceptor (GCNF/RTR) V$HOXF/EN1.01 Homeobox protein engrailed 0.771936-1952 1944 (+) 0.782 0.813 tgagTTCAagtgcattt (en-1) V$NKXH/NKX25.01homeo domain factor Nkx- 1.00 1939-1951 1945 (+) 1.000 1.000gttcAAGTgcatt 2.5/Csx, tinman homolog, high affinity sitesV$GATA/GATA3.02 GATA-binding factor 3 0.91 1953-1965 1959 (+) 1.0000.928 agaAGATataatg V$TBPF/TATA.01 cellular and viral TATA box 0.901968-1984 1976 (−) 0.891 0.912 atataTATAtggccata elements V$SRFF/SRF.01serum response factor 0.66 1969-1987 1978 (+) 1.000 0.777atggccaTATAtatatata V$CLOX/CDPCR3.01 cut-like homeodomain protein 0.751972-1988 1980 (−) 1.000 0.806 atatatatatatATGGc V$PAX1/PAX1.01 Pax1paired domain protein, 0.61 2016-2034 2025 (−) 0.750 0.675CTGTgctgatatatatata expressed in the developing vertebral column ofmouse embryos V$TBPF/ATATA.01 Avian C-type LTR TATA box 0.81 2019-20352027 (+) 0.750 0.827 atatataTCAGcacagt V$GFI1/GfI1B.01 Growth factorindependence 1 0.82 2021-2035 2028 (+) 1.000 0.904 ataTATCagcacagt zincfinger protein Gfi-1B V$NRSF/NRSF.01 neuron-restrictive silencer 0.692025-2045 2035 (+) 1.000 0.704 atcAGCAcagtggaaacagtt factorV$NFAT/NFAT.01 Nuclear factor of activated T- 0.97 2033-2043 2038 (+)1.000 0.970 agtgGAAAcag cells V$AREB/AREB6.04 AREB6 (Atp1a1 regulatory0.98 2034-2046 2040 (−) 1.000 0.991 taactGTTTccac element binding factor6) V$HNF1/HNF1.01 hepatic nuclear factor 1 0.78 2036-2052 2044 (−) 1.0000.798 tGTTAttaactgtttcc V$FKHD/XFD3.01 Xenopus fork head domain 0.822038-2054 2046 (+) 0.826 0.824 aaacagttAATAacatt factor 3 V$PDX1/PDX1.01Pdx1 (IDX1/IPF1) pancreatic 0.74 2036-2056 2046 (+) 1.000 0.749ggaaacagtTAATaacatttt and intestinal homeodomain TF V$OCT1/OCT1.01octamer-binding factor 1 0.77 2050-2064 2057 (−) 1.000 0.863taTATGctaaaatgt V$TBPF/TATA.01 cellular and viral TATA box 0.902053-2069 2061 (−) 0.891 
0.908 tagtaTATAtgctaaaa elements V$ETSF/GABP.01GABP: GA binding protein 0.85 2080-2096 2088 (+) 1.000 0.897gaggctGGAAgggggct V$BEL1/BEL1.01 Bel-1 similar region (defined 0.782083-2105 2094 (+) 1.000 0.787 gctggaagggggcTCAGcagtta in LentivirusLTRs) V$VMYB/VMYB.01 v-Myb 0.90 2097-2107 2102 (−) 0.876 0.901attAACTgctg V$GREF/ARE.01 Androgene receptor binding 0.80 2106-2124 2115(+) 0.750 0.840 atagcacatacTATTcttc site V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1)pancreatic 0.74 2137-2157 2147 (+) 0.782 0.747 gtttggtttTCATcacccatg andintestinal homeodomain TF V$MYOD/MYOD.02 myoblast determining factor0.98 2154-2168 2161 (−) 1.000 0.988 gaacCACCtgacatg V$GATA/GATA1.03GATA-binding factor 1 0.95 2169-2181 2175 (−) 1.000 0.958 tacaGATAgaaatV$AP4R/ Tal-1beta/E47 heterodimer 0.87 2179-2195 2187 (+) 1.000 0.924gtaacCAGAtgatacga TAL1BETAE47.01 V$OAZF/ROAZ.01 Rat C2H2 Zn fingerprotein 0.73 2204-2220 2212 (−) 0.750 0.762 agGTACccaaggggact involvedin olfactory neuronal differentiation V$GATA/GATA1.01 GATA-bindingfactor 1 0.96 2217-2229 2223 (−) 1.000 0.960 aggtGATAgaggt V$MYOD/E47.02TAL1/E47 dimers 0.93 2220-2234 2227 (−) 1.000 0.939 atagCAGGtgatagaV$LTUP/TAACC.01 Lentiviral TATA upstream 0.71 2225-2247 2236 (+) 0.7590.710 cacctgctattctCACCaaaga element V$RREB/RREB1.01 Ras-responsiveelement 0.79 2239-2253 2246 (+) 1.000 0.805 aCCCAaagacacaca bindingprotein 1 V$OCT1/OCT1.05 octamer-binding factor 1 0.90 2251-2265 2258(−) 0.944 0.904 tGTATgtgagtgtgt V$OCT1/OCT1.02 octamer-binding factor 10.82 2282-2296 2289 (+) 1.000 0.854 tgcATGCacatagtt V$COUP/COUP.01 COUPantagonizes HNF-4 by 0.81 2284-2298 2291 (−) 0.977 0.855 tGAACtatgtgcatgbinding site competition or synergizes by direct protein — proteininteraction with HNF-4 V$MEF2/MEF2.01 myogenic enhancer factor 2 0.742290-2312 2301 (+) 0.750 0.767 catagttcAAAAaataaaatttt V$CDXF/CDX2.01Cdx-2 mammalian caudal 0.84 2296-2314 2305 (−) 1.000 0.896ttaaaatTTTAttttttga related intestinal transcr. 
factor V$MYT1/MYT1.01MyT1 zinc finger transcription 0.75 2301-2313 2307 (−) 0.750 0.798taaAATTttattt factor involved in primary neurogenesis V$NFAT/NFAT.01Nuclear factor of activated T- 0.97 2314-2324 2319 (+) 1.000 0.991aaagGAAAaaa cells V$CIZF/NMP4.01 NMP4 (nuclear matrix protein 0.972317-2327 2322 (+) 1.000 0.977 ggAAAAaaagc 4)/CIZ (Cas-interacting zincfinger protein) V$GATA/GATA3.02 GATA-binding factor 3 0.91 2326-23382332 (−) 1.000 0.946 aaaAGATttgagc V$HMTB/MTBF.01 muscle-specific Mtbinding 0.90 2351-2359 2355 (−) 1.000 0.901 aggaATTTt siteV$NOLF/OLF1.01 olfactory neuron-specific 0.82 2350-2372 2361 (+) 0.8060.820 taaaatTCCTatgagtgtgtgat factor V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1)pancreatic 0.74 2363-2383 2373 (−) 0.782 0.753 tactgacttTGATcacacact andintestinal homeodomain TF V$GATA/GATA3.02 GATA-binding factor 3 0.912395-2407 2401 (−) 1.000 0.942 cacAGATtatacc V$NFAT/NFAT.01 Nuclearfactor of activated T- 0.97 2406-2416 2411 (+) 1.000 0.971 tgtgGAAAacacells V$OCTP/OCT1P.01 octamer-binding factor 1, 0.86 2433-2445 2439 (+)0.980 0.879 ctcagtATTCaca POU-specific domain V$MITF/MIT.01 MIT(microphthalmia 0.81 2438-2456 2447 (−) 1.000 0.827 ctactttCATGtgtgaatatranscription factor) and TFE3 V$PAX8/PAX8.01 PAX 2/5/8 binding site0.88 2441-2453 2447 (−) 0.850 0.952 cttTCATgtgtga V$TBPF/ATATA.01 AvianC-type LTR TATA box 0.81 2451-2467 2459 (+) 1.000 0.838aagtagcTAAGaataaa V$GATA/GATA3.02 GATA-binding factor 3 0.91 2462-24742468 (−) 1.000 0.960 aatAGATtttatt V/$CLOX/CLOX.01 Clox 0.81 2462-24782470 (+) 0.806 0.819 aataaaATCTattcatc V$HNF6/HNF6.01 Liver enriched Cut— 0.82 2464-2478 2471 (+) 0.785 0.846 taaaaTCTAttcatc Homeodomaintranscription factor HNF6 (ONECUT) V$PIT1/PIT1.01 Pit1, GHF-1 pituitaryspecific 0.86 2468-2478 2473 (+) 1.000 0.890 atctATTCatc pou domaintranscription factor V$AP4R/ Tal-1beta/ITF-2 heterodimer 0.85 2469-24852477 (−) 1.000 0.881 aaaaaCAGAtgaataga TAL1BETAITF2.01 V$CIZF/NMP4.01NMP4 (nuclear matrix protein 0.97 2477-2487 2482 (−) 1.000 0.981ggAAAAacaga 
4)/CIZ (Cas-interacting zinc finger protein) V$NFAT/NFAT.01Nuclear factor of activated T- 0.97 2480-2490 2485 (−) 1.000 0.976taagGAAAaac cells V$STAT/STAT.01 signal transducers and 0.87 2479-24972488 (−) 1.000 0.872 aggattttaaGGAAaaaca activators of transcriptionV$TBPF/TATA.02 Mammalian C-type LTR TATA 0.89 2484-2500 2492 (+) 1.0000.897 actgagtcAACActgta box V$FKHD/XFD3.01 Xenopus fork head domain 0.822501-2517 2509 (−) 1.000 0.880 actgagtcAACActgta factor 3 V$AP1F/AP1.01AP1 binding site 0.95 2500-2520 2510 (−) 1.000 0.984accactgaGTCAacactgtag V$AP1F/AP1.01 AP1 binding site 0.95 2504-2524 2514(+) 0.964 0.984 agtgttgaCTCAgtggttgct V$PCAT/CAAT.01 cellular and viralCCAAT box 0.90 2513-2523 2518 (−) 0.826 0.904 gcaaCCACtga V$CDXF/CDX2.01Cdx-2 mammalian caudal 0.84 2524-2542 2533 (+) 1.000 0.883tttaaatTTTAtgctcaaa related intestinal transcr. factor V$MYT1/MYT1.02MyT1 zinc finger transcriotion 0.88 2539-2551 2545 (+) 1.000 0.891caaAAGTtgaagc factor involved in primary neurogenesis V$ETSF/FLI.01 ETSfamily member FLI 0.81 2560-2576 2568 (+) 1.000 0.829 tgaaCCGGtaattctacV$MYT1/MYT1.01 MyT1 zinc finger transcription 0.75 2569-2581 2575 (−)1.000 0.757 acaAAGTagaatt factor involved in primary neurogenesisV$TBPF/ATATA.01 Avian C-type LTR TATA box 0.81 2576-2592 2584 (−) 0.7500.816 aagtattTAATacaaag V$SATB/SATB1.01 Special AT-rich sequence- 0.932578-2594 2586 (−) 1.000 0.939 acaagtattTAATacaa binding protein 1,predominantly expressed in thymocytes, binds to matrix attachmentregions (MARS) V$NKXH/NKX31.01 prostate-specific 0.84 2584-2596 2590 (−)1.000 0.865 taacAAGTattta homeodomain protein NKX3.1 V$PARF/DBP.01Albumin D-box binding 0.84 2589-2603 2596 (+) 1.000 0.882acttgTTATgcatcg protein V$PAX5/PAX5.02 B-cell-specific activating 0.752591-2619 2605 (−) 1.000 0.758 aacttgatttgttgAGCGatgcataacaa proteinV$ECAT/NFY.03 nuclear factor Y (Y-box 0.80 2604-2618 2611 (+) 0.7500.809 ctcaaCAAAtcaagt binding factor) V$GFI1/GFI1.01 Growth factorindependence 1 0.97 2608-2622 2615 (+) 1.000 
0.976 acaAATCaagtttta zincfinger protein acts as transcriptional repressor V$HNF6/HNF6.01 Liverenriched Cut — 0.82 2608-2622 2615 (+) 1.000 0.830 acaaaTCAAgttttaHomeodomain transcription factor HNF6 (ONECUT) V$MYT1/MYT1.01 MyT1 zincfinger transcription 0.75 2610-2622 2616 (−) 0.750 0.756 taaAACTtgatttfactor involved in primary neurogenesis V$PAX8/PAX8.01 PAX 2/5/8 bindingsite 0.88 2610-2622 2616 (+) 1.000 0.907 aaaTCAAgtttta V$TTFF/TTF1.01Thyroid transcription factor-1 0.92 2609-2623 2616 (+) 1.000 0.936caaatCAAGttttaa (TTF1) binding site V$MYT1/MYT1.02 MyT1 zinc fingertranscription 0.88 2612-2624 2618 (+) 1.000 0.887 atcAAGTtttaac factorinvolved in primary neurogenesis V$CDXF/CDX2.01 Cdx-2 mammalian caudal0.84 2612-2630 2621 (+) 1.000 0.883 atcaagtTTTAacacacca relatedintestinal transcr. factor V$SORY/HMGIY.01 HMGI(Y) high-mobility-group0.92 2649-2665 2657 (−) 1.000 0.925 ttaaaaAATTtaagata protein I (Y),architectural transcription factor organizing the framework of a nuclearprotein-DNA transcriptional complex V$HOXF/EN1.01 Homeobox proteinengrailed 0.77 2657-2673 2665 (+) 1.000 0.780 atttTTTAaatgggcat (en-1)V$OCT1/OCT1.06 octamer-binding factor 1 0.80 2662-2676 2669 (−) 0.7500.818 tttatgccCATTtaa V$BCL6/BCL6.01 POZ/zinc finger protein, 0.762683-2699 2691 (+) 1.000 0.796 ctaTTCCtacagaagtc transcriptionalrepressor, translocations observed in diffuse large cell lymphomaV$OCTP/OCT1P.01 octamer-binding factor 1, 0.86 2715-2727 2721 (+) 1.0000.860 ctgaaaATGCatt POU-specific domain V$TEAF/TEF1.01 TEF-1 relatedmuscle factor 0.84 2722-2734 2728 (+) 1.000 0.898 tgCATTcctgattV$GFI1/GFI1.01 Growth factor independence 1 0.97 2723-2737 2730 (−)1.000 0.981 ataAATCaggaatgc zinc finger protein acts as transcriptionalrepressor V$HOXT/ Homeobox protein MEIS1 0.79 2729-2741 2735 (+) 1.0000.929 cTGATttatgtaa MEIS1_HOXA9.01 binding site V$HOXF/HOXA9.01 Memberof the vertebrate 0.87 2728-2744 2736 (+) 1.000 0.964 cctGATTtatgtaaataHOX - cluster of homeobox factors V$PARF/DBP.01 Albumin 
D-box binding0.84 2729-2743 2736 (+) 1.000 0.861 ctgatTTATgtaaat proteinV$VBPF/VBP.01 PAR-type chicken vitellogenin 0.86 2732-2742 2737 (−)1.000 0.929 tTTACataaat promoter-binding protein V$CREB/E4BP4.01 E4BP4,bZIP domain, 0.80 2728-2748 2738 (+) 1.000 0.943 cctgatttatGTAAatatatgtranscriptional repressor V$OCT1/OCT1.01 octamer-binding factor 1 0.772733-2747 2740 (+) 1.000 0.895 ttTATGtaaatatat V$FKHD/XFD1.01 Xenopusfork head domain 0.90 2733-2749 2741 (+) 1.000 0.940 tttatgTAAAtatatgtfactor 1 V$SRFF/SRF.01 serum response factor 0.66 2736-2754 2745 (+)1.000 0.691 atgtaaaTATAtgtatata V$OCTP/OCT1P.01 octamer-binding factor1, 0.86 2746-2758 2752 (+) 0.849 0.883 atgtatATACata POU-speciflc domainV$CLOX/CDPCR3.01 cut-like homeodomain protein 0.75 2748-2764 2756 (+)0.888 0.755 gtatatacatatATAGc V$TBPF/TATA.01 cellular and viral TATA box0.90 2749-2765 2757 (−) 0.891 0.903 ggctaTATAtgtatata elementsV$SRFF/SRF.01 serum response factor 0.66 2750-2768 2759 (+) 1.000 0.709atatacaTATAtagcctta V$TBPF/ATATA.01 Avian C-type LTR TATA box 0.812759-2775 2767 (−) 1.000 0.816 ttgttttTAAGgctata V$TBPF/TATA.02Mammalian C-type LTR TATA 0.89 2762-2778 2770 (+) 1.000 0.899agcctTAAAaacaaaga box V$CABL/CABL.01 Multifunctional c-Abl src type 0.972769-2779 2774 (+) 1.000 0.973 aaAACAaagat tyrosine kinaseV$LEFF/LEF1.01 TCF/LEF-1, involved in the 0.86 2766-2782 2774 (+) 1.0000.863 ttaaaaaCAAAgattgt Wnt signal transduction pathway V$OCT1/OCT1.06octamer-binding factor 1 0.80 2775-2789 2782 (+) 1.000 0.811aagattgtAATTttt V$MEF2/MMEF2.01 myocyte enhancer factor 0.90 2776-27982787 (−) 1.000 0.900 acaatttaTAAAaattacaatct V$OCT1/OCT1.06octamer-binding factor 1 0.80 2780-2794 2787 (−) 1.000 0.844tttataaaAATTaca V$TBPF/TATA.01 cellular and viral TATA box 0.902779-2795 2787 (−) 1.000 0.956 atttaTAAAaattacaa elementsV$CART/CART1.01 Cart-1 (cartilage 0.84 2780-2796 2788 (+) 1.000 0.875tgTAATttttataaatt homeoprotein 1) V$FKHD/XFD2.01 Xenopus fork headdomain 0.89 2780-2796 2788 (−) 1.000 0.903 aatttaTAAAaattaca 
factor 2V$MEF2/MEF2.05 MEF2 0.96 2778-2800 2789 (−) 1.000 0.973tcacaatttaTAAAaattacaat V$BRNF/BRN3.01 POU transcription factor Bm-30.78 2785-2801 2793 (−) 0.750 0.798 atcACAAtttataaaaa V$TBPF/TATA.01cellular and viral TATA box 0.90 2786-2802 2794 (+) 1.000 0.927ttttaTAAAttgtgatt elements V$GFI1/GFI1.01 Growth factor independence 10.97 2791-2805 2798 (−) 1.000 0.997 aaaAATCacaattta zinc finger proteinacts as transcriptional repressor V$HOXT/ Homeobox protein MEIS1 0.792797-2809 2803 (+) 1.000 0.806 gTGATttttaaaa MEIS1_HOXA9.01 binding siteV$MEF2/MMEF2.01 myocyte enhancer factor 0.90 2792-2814 2803 (−) 1.0000.923 tattttttTAAAaatcacaattt V$MEF2/MEF2.05 MEF2 0.96 2795-2817 2806(+) 1.000 0.990 ttgtgattttTAAAaaaataaac V$MEF2/MMEF2.01 myocyte enhancerfactor 0.90 2797-2819 2808 (+) 1.000 0.905 gtgattttTAAAaaaataaacctV$HNF1/HNF1.01 hepatic nuclear factor 1 0.78 2802-2818 2810 (−) 0.7550.796 gGTTTatttttttaaaa V$MEF2/MEF2.01 myogenic enhancer factor 2 0.742799-2821 2810 (+) 0.750 0.775 gatttttaAAAAaataaacctgc V$HOXF/HOX1-3.01Hox-1.3, vertebrate 0.83 2814-2830 2822 (+) 1.000 0.848aaacctgcATTAtcttc homeobox protein V$PARF/DBP.01 Albumin D-box binding0.84 2816-2830 2823 (−) 0.884 0.851 gaagaTAATgcaggt proteinV$PDX1/ISL1.01 Pancreatic and intestinal lim- 0.82 2814-2834 2824 (−)1.000 0.853 tgctgaagaTAATgcaggttt homeodomain factor V$GATA/GATA1.02GATA-binding factor 1 0.99 2819-2831 2825 (−) 1.000 0.993 tgaaGATAatgcaV$HEAT/HSF1.01 heat shock factor 1 0.93 2845-2855 2850 (+) 0.867 0.951TGAAtgttcct V$MYT1/MYT1.02 MyT1 zinc finger transcription 0.88 2853-28652859 (+) 1.000 0.893 cctAAGTtttgta factor involved in primaryneurogenesis V$BCL6/BCL6.02 POZ/zinc finger protein, 0.77 2857-2873 2865(+) 1.000 0.772 agttttgTAGAacttga transcriptional repressor,translocations observed in diffuse large cell lymphoma V$TTFF/TTF1.01Thyroid transcription factor-1 0.92 2863-2877 2870 (−) 1.000 0.927cgtgtCAAGttctac (TTF1) binding site V$EBOX/USF.02 upstream stimulatingfactor 0.94 2868-2884 2876 (−) 1.000 
0.997 tctgccaCGTGtcaagtV$HOXF/PTX1.01 Pituitary Homeobox 1 (Ptx1) 0.79 2892-2908 2900 (+) 1.0000.795 aggattTTAGtctacac V$MYOD/LMO2COM.01 complex of Lmo2 bound to 0.982901-2915 2908 (−) 1.000 0.981 gatgCAGGtgtagac Tal-1, E2A proteins, andGATA-1, half-site 1 V$REBV/EBVR.01 Epstein-Barr virus 0.81 2904-29242914 (−) 1.000 0.832 ctgtcctcagatgcaGGTGta transcription factor RV$ETSF/PU1.01 Pu.1 (Pu120) Ets-like 0.86 2932-2948 2940 (+) 1.000 0.873ctaacaGGAAaggagac transcription factor identified in lymphoid B-cellsV$MITF/MIT.01 MIT (microphthalmia 0.81 2943-2961 2952 (+) 1.000 0.829ggagacaCATGtgtggtag transcription factor) and TFE3 V$HAML/AML1.01runt-factor AML-1 1.00 2950-2964 2957 (+) 1.000 1.000 catgtgTGGTagttcV$NFKB/CREL.01 c-Rel 0.91 2954-2968 2961 (+) 1.000 0.919 tgtggtagTTCCcagV$IKRS/IK3.01 Ikaros 3, potential regulator 0.84 2958-2970 2964 (−)1.000 0.841 aactgGGAActac of lymphocyte differentiation V$RBPF/RBPJK.01Mammalian transcriptional 0.84 2957-2971 2964 (−) 1.000 0.842aaacTGGGaactacc repressor RBP-Jkappa/CBF1 V$E2FF/E2F.01 E2F, involved incell cycle 0.74 2966-2980 2973 (−) 0.750 0.784 ttcacgtCAAAactgregulation, interacts with Rb p107 protein V$E4FF/E4F.01GLI-Krueppel-related 0.82 2968-2980 2974 (−) 1.000 0.830 ttcAcGTcaaaactranscription factor, regulator of adenovirus E4 promoter V$CREB/ATF6.02Activating transcription factor 0.85 2966-2986 2976 (+) 1.000 0.985cagttttGACGtgaaaagtcc 6, member of b-zip family, induced by ER stressV$EBOX/ARNT.01 AhR nuclear translocator 0.89 2968-2984 2976 (+) 1.0000.891 gttttgaCGTGaaaagt homodimers V$E4FF/E4F.01 GLI-Krueppel-related0.82 2971-2983 2977 (+) 1.000 0.909 ttgACGTgaaaag transcription factor,regulator of adenovirus E4 promoter V$EBOR/XBP1.01 X-box-binding protein1 0.86 2970-2984 2977 (+) 1.000 0.890 tttgACGTgaaaagt V$E2FF/E2F.01 E2F,involved in cell cycle 0.74 2971-2985 2978 (+) 1.000 0.837ttgacgtGAAAagtc regulation, interacts with Rb p107 proteinV$STAT/STAT.01 signal transducers and 0.87 2989-3007 2998 (+) 1.0000.937 
cattcttactGGAAacctc activators of transcription V$BCL6/BCL6.02POZ/zinc finger protein, 0.77 2991-3007 2999 (+) 0.800 0.805ttcttacTGGAaacctc transcriptional repressor, translocations observed indiffuse large cell lymphoma V$XSEC/STAF.01 Se-Cys tRNA gene 0.773003-3025 3014 (+) 0.782 0.791 acctCCCTgaatccatgccaagc transcriptionactivating factor V$NF1F/NF1.01 Nuclear factor 1 0.94 3007-3025 3016 (−)1.000 0.964 gctTGGCatggattcaggg V$OCT1/OCT1.02 octamer-binding factor 10.82 3014-3028 3021 (+) 1.000 0.820 tccATGCcaagcact V$RCAT/ MammalianC-type LTR 0.75 3019-3043 3031 (+) 1.000 0.787 gCCAAgcactacccatcaccttgacCLTR_CAAT.01 CCAAT box V$SF1F/SF1.01 SF1 steroidogenic factor 1 0.953033-3045 3039 (−) 1.000 0.954 cagtCAAGgtgat V$OCT1/OCT1.01octamer-binding factor 1 0.77 3038-3052 3045 (−) 1.000 0.800ctTATGccagtcaag V$PARF/DBP.01 Albumin D-box binding 0.84 3042-3056 3049(−) 1.000 0.862 agtgcTTATgccagt protein V$ETSF/ETS1.01 c-Ets-1 bindingsite 0.92 3057-3073 3065 (−) 1.000 0.920 atcaaAGGAaatgagtgV$LEFF/LEF1.01 TCF/LEF-1, involved in the 0.86 3062-3078 3070 (−) 1.0000.969 ggggcatCAAAggaaat Wnt signal transduction pathway V$MAZF/MAZ.01Myc associated zinc finger 0.90 3072-3084 3078 (−) 1.000 0.912gaggGAGGggcat protein (MAZ) V$SP1F/GC.01 GC box elements 0.88 3071-30853078 (−) 0.876 0.920 tgagGGAGgggcatc V$TBPF/TATA.01 cellular and viralTATA box 0.90 3091-3107 3099 (+) 1.000 0.973 tattaTAAAagcacagt elementsV$SEF1/SEF1.01 SEF1 binding site 0.69 3099-3117 3108 (−) 1.000 0.700gaaagagacgaCTGTgctt
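
The rows of the table above can be handled programmatically once they are separated. The following is a minimal sketch, not part of the original analysis. It assumes that each row has the column order: matrix name, description, optimized threshold, position range, anchor, strand, core similarity, matrix similarity, matched sequence, and that capital letters in the sequence mark the core. The names `ROW`, `Site`, `parse_row` and `core_of`, and the 0.95 cut-off, are hypothetical and chosen only for illustration. The four sample rows are copied from the table with their wrapped descriptions rejoined.

```python
import re
from dataclasses import dataclass

# Assumed column order for one table row (an assumption, not stated here):
# matrix, description, optimized threshold, start-end, anchor, strand,
# core similarity, matrix similarity, matched sequence.
ROW = re.compile(
    r"^(?P<matrix>V\$\S+)\s+(?P<desc>.*?)\s+(?P<opt>\d\.\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)\s+(?P<anchor>\d+)\s+"
    r"\((?P<strand>[+\-\u2212])\)\s+(?P<core_sim>\d\.\d+)\s+"
    r"(?P<matrix_sim>\d\.\d+)\s+(?P<seq>[A-Za-z]+)$"
)


@dataclass
class Site:
    matrix: str
    desc: str
    start: int
    end: int
    strand: str
    core_sim: float
    matrix_sim: float
    seq: str


def parse_row(row: str) -> Site:
    """Parse one de-flattened table row into a Site record."""
    m = ROW.match(row.strip())
    if m is None:
        raise ValueError(f"unrecognized row: {row!r}")
    # Normalize the typographic minus sign used in the table to ASCII '-'.
    strand = "+" if m["strand"] == "+" else "-"
    return Site(m["matrix"], m["desc"], int(m["start"]), int(m["end"]),
                strand, float(m["core_sim"]), float(m["matrix_sim"]), m["seq"])


def core_of(seq: str) -> str:
    """Return the capitalized letters, assumed to mark the core sequence."""
    return "".join(c for c in seq if c.isupper())


ROWS = [
    "V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1) pancreatic and intestinal homeodomain TF "
    "0.74 1423-1443 1433 (\u2212) 1.000 0.889 ttcatacaaTAATgacatatg",
    "V$GATA/GATA1.02 GATA-binding factor 1 "
    "0.99 1528-1540 1534 (+) 1.000 0.993 taatGATAagaaa",
    "V$HAML/AML1.01 runt-factor AML-1 "
    "1.00 2950-2964 2957 (+) 1.000 1.000 catgtgTGGTagttc",
    "V$TBPF/TATA.01 cellular and viral TATA box elements "
    "0.90 3091-3107 3099 (+) 1.000 0.973 tattaTAAAagcacagt",
]

sites = [parse_row(r) for r in ROWS]

# Hypothetical cut-off: keep only matches with matrix similarity >= 0.95.
strong = [s for s in sites if s.matrix_sim >= 0.95]
```

A filter of this kind is one way to shortlist candidate sites for follow-up; the threshold is arbitrary and would need to be chosen for the question at hand.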

1. A method for identifying agents which modulate INGAP expression, comprising: contacting a host cell comprising a reporter construct having at least one INGAP regulatory region and a region encoding a detectable product with a test agent; determining expression of the detectable product in the cell; and identifying the test agent as a modulator of INGAP expression if the test agent modulates expression of the detectable product in the cell.
2. The method of claim 1 wherein the regulatory region nucleotide sequence comprises one or more regions chosen from nucleotides from the group consisting of SEQ ID NO: 1, 2, 23, 32, 35, 37, 28, 24, 25, 26, 27, 29, 30, 31, 33, 34, 38, and 36.
3. The method of claim 6 wherein the regulatory sequence comprises nucleotide sequences 1-3137 of SEQ ID NO: 2.
4. The method of claim 2 wherein the reporter construct further comprises: a promoter element interposed between the regulatory region nucleotide sequence and the nucleotide sequence encoding the detectable product.
5. The method of claim 4 wherein the promoter element is selected from SEQ ID NO: 2.
6. An in vitro method for identifying agents which modulate INGAP expression, comprising: contacting a reporter construct having at least one INGAP regulatory region and a region encoding a detectable product with a test substance under conditions sufficient for transcription and translation of said nucleotide sequence; determining expression of the detectable protein or nucleic acid product; and identifying the test substance as a modulator of INGAP expression if the test substance modulates expression of the detectable product.
7. The method of claim 6 wherein the regulatory region nucleotide sequence comprises one or more regions chosen from nucleotides from the group consisting of SEQ ID NO: 1, 2, 23, 32, 35, 37, 28, 24, 25, 26, 27, 29, 30, 31, 33, 34, 38, and 36.
8. The method of claim 7 wherein the regulatory sequence comprises nucleotide sequences 1-3137 of SEQ ID NO: 2.
9. The method of claim 6 wherein the reporter construct further comprises: a promoter element interposed between the regulatory region nucleotide sequence and the nucleotide sequence encoding the detectable product.
10. The reporter construct of claim 9 wherein the promoter element is selected from SEQ ID NO: 2.
11. An in vitro method for identifying agents which modulate INGAP expression, comprising: contacting the SEQ ID NOS 1, 2, 23, 32, 35, 37, 28, 24, 25, 26, 27, 29, 30, 31, 33, 34, 38, and 36 or fragments thereof with a test agent; determining binding of the test agent to the nucleic acid; and identifying the test agent as a potential modulator of INGAP expression if the test agent binds to the nucleic acid.
12. The method of claim 11 wherein the sequence comprises nucleotide sequences 1-3137 of SEQ ID NO: 2.
13. A method for inducing INGAP expression in a mammal in need thereof, comprising administering to the mammal an effective amount of a factor that stimulates INGAP expression in the said mammal.
14. The method of claim 13 wherein the factor that stimulates INGAP expression was identified by the methods of claim 1.
15. The method of claim 14 wherein the factor that stimulates INGAP expression was identified by the methods of claim 11.
16. The method of claim 13 wherein the factor that stimulates INGAP expression is selected from hLIF or PMA.
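
The identifying step recited in the screening claims above, which compares expression of the detectable product with and without a test agent, can be illustrated with a short sketch. This is an illustration under stated assumptions, not the claimed method itself: the function name `classify_agent`, the 1.5-fold thresholds, and all signal values are hypothetical and do not come from this document. In practice replicate numbers, normalization and statistical testing would have to be chosen for the particular reporter system.

```python
from statistics import mean


def classify_agent(treated, control, up=1.5, down=1 / 1.5):
    """Classify a test agent from replicate reporter readings.

    treated, control: reporter signals (e.g. arbitrary light units) from
    cells with and without the test agent. The thresholds `up` and `down`
    are hypothetical fold-change cut-offs chosen for illustration only.
    Returns (label, fold_change).
    """
    if not treated or not control:
        raise ValueError("need at least one reading per condition")
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control signal must be positive")
    fold = mean(treated) / baseline
    if fold >= up:
        label = "up-regulator"
    elif fold <= down:
        label = "down-regulator"
    else:
        label = "no effect"
    return label, fold


# Made-up readings for three hypothetical agents against one control set.
control = [100.0, 110.0, 90.0]
result_a = classify_agent([300.0, 320.0, 310.0], control)
result_b = classify_agent([50.0, 55.0, 45.0], control)
result_c = classify_agent([100.0, 105.0], control)
```

With these made-up readings, agent A exceeds the upper threshold, agent B falls below the lower one, and agent C stays within the band treated as no effect.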